Final Exam Review
Spring 2011
Exam format
 About 75 questions
 45% multiple choice and T/F
 30% short fill-ins
 25% short-paragraph explanations
What to study
 50 questions from Exam 1 and 2
 12-15 questions about presentation topics
 10-13 questions will come from labs
 Concepts from
 Market Basket: Data Mining
 SCM: RFID (hardware) + XML (concept)
 Fund Trading: DBs for Optimization
 Pivot Chart: DBs for Discovery & Prediction
 Wagemart: DBs for Decision Support
Market Basket Analysis
 Support: Probability (P) that an item is in someone’s checkout basket
A,B,E
A,B,F
A,B
A,B,F,G
A,D,F
C,D
C,D,G
E,F,G
E,F
E,G
 P(A) = 5/10 = 50%
 P(AB) = 4/10 = 40%
 P(C) = 2/10 = 20%
 P(CD) = 2/10 = 20%
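The support figures above can be checked with a short Python sketch. The ten baskets are the ones listed on the slide; the names `BASKETS` and `support` are illustrative choices, not part of the lab.

```python
# The ten checkout baskets listed on the slide, one set of items per basket.
BASKETS = [
    {"A", "B", "E"},
    {"A", "B", "F"},
    {"A", "B"},
    {"A", "B", "F", "G"},
    {"A", "D", "F"},
    {"C", "D"},
    {"C", "D", "G"},
    {"E", "F", "G"},
    {"E", "F"},
    {"E", "G"},
]


def support(items, baskets=BASKETS):
    """Fraction of baskets that contain every item in `items`."""
    wanted = set(items)
    hits = sum(1 for basket in baskets if wanted <= basket)
    return hits / len(baskets)


print(support("A"))   # P(A)
print(support("AB"))  # P(AB)
print(support("C"))   # P(C)
print(support("CD"))  # P(CD)
```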
Market Basket Analysis
 Confidence X → Y = P(XY)/P(X): if item X is purchased, what is the
probability that item Y is also purchased?
 Confidence A → B
= P(AB)/P(A)
= 40%/50%
= 80%
 Confidence C → D
= P(CD)/P(C)
= 20%/20%
= 100%
 Given: P(A) = 5/10 = 50%
 P(AB) = 4/10 = 40%
 P(C) = 2/10 = 20%
 P(CD) = 2/10 = 20%
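A minimal Python sketch of the confidence formula, using the same ten baskets as the support slide. The helper names `count` and `confidence` are assumptions made for illustration. Note that the direction of the rule matters because the denominator is the support of the left-hand item.

```python
# The ten baskets from the support slide, written compactly as strings of items.
BASKETS = [set(b) for b in ("ABE", "ABF", "AB", "ABFG", "ADF",
                            "CD", "CDG", "EFG", "EF", "EG")]


def count(items, baskets=BASKETS):
    """Number of baskets that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for basket in baskets if wanted <= basket)


def confidence(x, y, baskets=BASKETS):
    """P(XY) / P(X): of the baskets containing X, the share that also contain Y."""
    return count(set(x) | set(y), baskets) / count(x, baskets)


print(confidence("A", "B"))  # 4 of the 5 baskets with A also contain B
print(confidence("C", "D"))  # both baskets with C also contain D
print(confidence("B", "A"))  # reversed rule: divides by the count of B instead
```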
Market Basket Analysis
 Quality X → Y = Confidence X → Y * P(XY)
 High quality association rules
 Quality A → B
= 80% * 40%
= 32%
 Quality C → D
= 100% * 20%
= 20%
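Substituting the confidence formula into the quality formula shows that quality depends only on the two support values. The worked lines below restate the slide's arithmetic in that form, using P(A) = 50%, P(AB) = 40%, P(C) = 20%, and P(CD) = 20% from the earlier slides.

```latex
\begin{align*}
\text{Quality}(X \rightarrow Y)
  &= \text{Confidence}(X \rightarrow Y) \cdot P(XY)
   = \frac{P(XY)}{P(X)} \cdot P(XY)
   = \frac{P(XY)^2}{P(X)} \\
\text{Quality}(A \rightarrow B) &= \frac{0.40^2}{0.50} = 0.32 \\
\text{Quality}(C \rightarrow D) &= \frac{0.20^2}{0.20} = 0.20
\end{align*}
```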
Apriori Algorithm:
 Calculate high quality association rules given
 billions of transactions
 millions of items
 Complex Association Rule
ADGMS → CLPT (50% quality, 80% confidence), i.e.,
 5 items (A, D, G, M, and S) imply with great
confidence that 4 items (C, L, P, and T) are
purchased.
 Without the Apriori Algorithm, the calculation
would take too long (millions of years).
Apriori Algorithm
 How it works:
 By setting a minimum support level, the algorithm can
prune low-support pairs (2-itemsets) before computing
3-itemsets.
 Then the pruned 3-itemsets are used to compute 4-itemsets.
The algorithm is guaranteed to return all
the itemsets above the minimum support level.
 When you get to 5-, 6-, or 7-itemsets, the pruning
reduces the number of possible sets from trillions to
a few thousand or hundred, which can help
humans discover very complex, high quality
association rules.
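The level-by-level pruning described above can be sketched in a few lines of Python. This is a simplified illustration run on the ten baskets from the support slide, not a version tuned for billions of transactions. The function name `frequent_itemsets` and the 20% minimum support used in the usage lines are assumptions.

```python
from itertools import combinations

# The ten baskets from the Market Basket slides.
BASKETS = [set(b) for b in ("ABE", "ABF", "AB", "ABFG", "ADF",
                            "CD", "CDG", "EFG", "EF", "EG")]


def frequent_itemsets(baskets, min_support):
    """Return {itemset: support} for every itemset at or above min_support."""
    n = len(baskets)

    def support(itemset):
        return sum(1 for basket in baskets if itemset <= basket) / n

    # Level 1: single items that meet the minimum support.
    items = sorted(set().union(*baskets))
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    result = {s: support(s) for s in current}

    k = 2
    while current:
        # Join surviving (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself have survived.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Keep only the candidates that meet the minimum support.
        current = {c for c in candidates if support(c) >= min_support}
        result.update({c: support(c) for c in current})
        k += 1
    return result


found = frequent_itemsets(BASKETS, min_support=0.2)
for itemset, value in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), value)
```

With these baskets, a candidate such as AFG is never counted because the pair AG already fell below the minimum support. That skipped work is the saving that grows enormous with millions of items.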
Importance of Apriori
Algorithm
 A process
 An innovation that takes terabytes of data and
reduces it to meaningful rules
 Raw Data → Relevant and Timely Information
 A.I.
 Data Mining
 Market Basket Analysis
Pivot Chart Lab
 Great example of Online Analytical Processing
(OLAP)
 Slice & Dice data (Temp., Mood, Day, Weather)
 Drill Down (look at only incorrect predictions)
 Unlike Data Mining,
 the process is interactive
 a person participates in the process
 The process is ad hoc.
 The process is not pre-determined, as it is in the Apriori algorithm.
Significance of Pivot Chart
Lab
 Business Intelligence (like A.I.)
 Use OLAP to find patterns
 Encode patterns as IF statements to predict
future cases.
 The spreadsheet can automate the human
decision making process on a large scale, faster
than a human.
 Such a system enables timely, accurate
predictions without a human decision-maker
(Business Intelligence System)
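As a rough illustration of encoding a pattern as an IF statement, the Python sketch below applies one rule to a handful of rows and then drills down to the incorrect predictions. Everything in it is hypothetical: the rows, the `went_out` outcome, and the rule itself are invented for illustration, and only the field names `weather` and `mood` echo the lab's dimensions.

```python
# Hypothetical records: values and the "went_out" outcome are invented.
records = [
    {"weather": "sunny", "mood": "good", "went_out": True},
    {"weather": "sunny", "mood": "bad", "went_out": False},
    {"weather": "rainy", "mood": "good", "went_out": False},
    {"weather": "rainy", "mood": "good", "went_out": True},
]


def predict(row):
    """A pattern spotted in a pivot chart, written as an IF statement (hypothetical rule)."""
    if row["weather"] == "sunny" and row["mood"] == "good":
        return True
    return False


# Drill down: keep only the rows where the rule predicted incorrectly.
misses = [row for row in records if predict(row) != row["went_out"]]
accuracy = 1 - len(misses) / len(records)

print(accuracy)
print(misses)
```

Once the rule is written down, it can be applied to any number of new rows without a person inspecting each one.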
Excel Pivot Charts as a tool
 First: Pattern is noticed
 Second: Interactive analysis tools (Pivot Chart)
help to confirm and pinpoint the pattern
 Example: A marketer thinks that geography plays
a role in sales; a Pivot chart shows that Southern
stores do have better sales.
Database queries as tools
 First: The data mining reveals numerous patterns
(association rules)
 Second: Human intelligence can derive the
theory behind the pattern.
 Example: The Apriori algorithm discovers a high
quality association rule (Beer → Diapers). Later,
marketers try to unravel the reason why.
 The data analysis must come before the hypothesis
because the data is too big for humans to analyze.
Fund Trading Lab
 Decision Support Automation: Using a Database to
compute the optimal sequence of trades.
 Too many combinations for a human to analyze
 Another Example of Business intelligence
1. At first we use a graph and human intuition to make
the trades
2. We do better if we use a query to calculate and
sort all possible transactions
3. We use Database tools to pick the best ones that
don’t overlap
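Steps 2 and 3 can be sketched in Python instead of database queries. The fund values below are hypothetical, not the lab's data. The sketch also assumes that all money sits in one fund at a time, that a sale and the next purchase may share a day, and that there are no fees. The selection step uses dynamic programming, which is a swapped-in technique and not necessarily what the lab's queries do.

```python
from itertools import combinations

# Hypothetical daily fund values (list index = day); not data from the lab.
PRICES = {
    "FUND_X": [10.0, 12.0, 11.0, 15.0, 14.0],
    "FUND_Y": [20.0, 19.0, 24.0, 23.0, 30.0],
}


def all_transactions(prices):
    """Step 2: every (growth, fund, buy_day, sell_day) with buy_day < sell_day, best first."""
    trades = [
        (values[sell] / values[buy], fund, buy, sell)
        for fund, values in prices.items()
        for buy, sell in combinations(range(len(values)), 2)
    ]
    return sorted(trades, reverse=True)


def best_sequence(trades, n_days):
    """Step 3: the non-overlapping trades with the largest compounded growth."""
    # best[d] = (growth factor, trades used) for the best plan that is in cash on day d.
    best = [(1.0, [])]
    for day in range(1, n_days):
        options = [best[day - 1]]  # option: make no sale on this day
        for growth, fund, buy, sell in trades:
            if sell == day and growth > 1:
                prior_growth, prior_trades = best[buy]
                options.append((prior_growth * growth,
                                prior_trades + [(fund, buy, sell)]))
        best.append(max(options, key=lambda option: option[0]))
    return best[-1]


trades = all_transactions(PRICES)
growth, plan = best_sequence(trades, n_days=5)
print(trades[0])  # the single best transaction
print(growth)     # compounded growth of the chosen sequence
print(plan)       # (fund, buy_day, sell_day) for each chosen trade
```

On this data the single best transaction is not part of the best sequence, which is why sorting alone is not enough and the non-overlap step matters.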
Decision Support Systems:
Wagemart vs. Fund Trading
Wagemart
 start with tons of data
 individual salaries, availability
 reduce it to simple info
 total cost, average rating
 to help make a decision.
Fund Trading
 start with less data
 Fund value for each day
 compute every possible transaction
 Much more data
 Queries are used to find the optimal transactions
Decision Support Systems:
Wagemart vs. Fund Trading
 Both systems model scenarios to compute the
outcome of decisions
 one is structured
 one scenario to optimize
 the other unstructured
 many different scenarios to consider
 Fund Trading was more structured, i.e., you can only
buy and sell; you just have to decide the optimal
day and funds to buy/sell.
 Wagemart was very unstructured, many different
ways to cut costs.
Porter’s 5-forces
 Do companies compete because it’s fun?
 Maybe some…
 They compete because of the threat of going
out of business.
 Profitability is the ultimate measure of success
 Why?
What are the threats?
 A new competitor
 Will take away your sales and profits?
 Because they are better?
 In business what does better really mean?
The five forces/threats
 New entrants
 Substitute products
 Rivalry
 Bargaining power of consumers
 Bargaining power of suppliers
Example
 Target forces its suppliers to use XML-formatted
shipment data and boxes tagged with RFID
chips.
 Apple refuses and wins
 Target has to use Apple’s system to sell Apple’s
products.
 What force is this?
Example
 Indirect: Brooke visits Google Shopping and
Shopzilla to compare prices on a new camera.
 She’ll buy from the most inexpensive online retailer
 Direct: Bradley uses LendingTree.com where
banks try to underbid each other to get his
business.
Example
 Disney World implements a new ride-tracking
system that directs visitors to the rides with
the shortest wait times.
 Forces Universal Studios to invest in a similar
system.
Example
 Everyone at the gym is using their iPhone or
Android phone to listen to music
 MP3 players are now collecting dust
Example
 Netflix emerges and puts 120 Blockbuster video
stores out of business
Competitive strategies
To fight the forces
1. Do something totally new (innovation)
2. Be inexpensive (cost leadership)
3. Be big to increase power (growth)
 Lock in your customers
 Lock out your competition
4. Make mutually beneficial partnerships (alliance)
5. Be different but in a good way (differentiation)
 Put up barriers to the competition
Example
 Imagine if Blockbuster decided to use
Internet/Mail delivery before Netflix.
 But Blockbuster was NOT ______________
 By the way, Netflix created a totally new process for
renting videos.
 How does an IS make this possible?
 How is the IS better than the old-fashioned
process?
E-commerce
 It was an innovation at one point
 Now it is necessary to stay in business
Example
 Walmart’s efficient supply chain cuts costs.
 RFID and XML play a role
 Their size allows them to negotiate low prices with
suppliers.
 Large companies absolutely need information
systems for good management
 Walmart’s strategy is 2-fold.
How do Information Systems really
help businesses to compete?
The labs provide many examples
 RFID, XML
 More accessible, timely information for improving
supply chain.
 Market Basket
 More relevant information for increasing sales/profits
 Wagemart
 More accurate information for modeling decisions
 Pivot Chart & Fund Trading
 Flexible information, manipulated in real time to
solve problems (prediction & optimization)
The 11 information attributes
are fair game
 Flexibility and accessibility are different.
 Putting something on the web makes it more
accessible
 Storing data electronically can make it more flexible
 Putting electronic data in a robust, standardized
format (XML) improves both.
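A small Python sketch of why a standardized format helps: once shipment data is in XML, any program can read it and reshape it without custom parsing. The shipment record below is hypothetical, and its element and attribute names (`shipment`, `box`, `rfid`, `item`, `sku`, `qty`) are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical shipment record; the element and attribute names are invented.
SHIPMENT_XML = """
<shipment id="S-001">
  <box rfid="TAG-1001"><item sku="A" qty="12"/></box>
  <box rfid="TAG-1002"><item sku="B" qty="30"/></box>
</shipment>
""".strip()

root = ET.fromstring(SHIPMENT_XML)

# The same data can be reshaped for different questions.
quantities = {item.get("sku"): int(item.get("qty")) for item in root.iter("item")}
tags = [box.get("rfid") for box in root.findall("box")]

print(quantities)
print(tags)
```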
Attribute Trade-offs
 Simple vs. Complete
 Secure vs. Accessible
Presentations
 Don’t forget to review presentations
 The websites will be linked on Tuesday
Textbook Reading
 Low priority
Top Priority
 Review past exams and look up correct answers
(Text and Google)
 Will post them on Tuesday
 Skim lab materials and instructions on Blackboard
 Create cheat sheet