University of Southern California
MARSHALL SCHOOL OF BUSINESS
Spring, 2004
Course Guidelines & Syllabus
IOM 528 – DATA WAREHOUSING, BUSINESS INTELLIGENCE AND DATA MINING
Instructor: Dr. Arif Ansari
Office: HOH 400 D
Office Hours: Tuesday and Thursday 12.00-1.00
Office phone: (213) 821-5521
Email: aansari@marshall.usc.edu
Emergency Contact number: 213-740-0172
TA: Gayatri Ratnaparkhi
Office: HOH 400 D (Hoffman Hall)
email: ratnapar@usc.edu
Office hours: TBA
COURSE OBJECTIVES
• To develop an understanding of the various concepts and tools behind data
warehousing and mining data for business intelligence.
• To develop quantitative skills pertinent to the analysis of data from huge
corporate data warehouses.
Overview:
This course is about how companies apply two new technologies, data warehousing
(DW) and data mining (DM, including business intelligence, BI) to empower their
employees, and build and manage a customer-centric business model. Besides learning
the strategic role DW and DM play in an enterprise, you will also get a close-up look at
DW and DM by working on cases and gaining hands-on experience using software tools.
Students taking this class will get an overview of the technologies of DW and BI/DM
from a managerial perspective.
Fortune 500 companies such as American Express and Wal-Mart have accumulated a
great deal of data from their day-to-day business. A data warehouse is the technology that
integrates the data collected from various sources, including transaction processing
systems and e-commerce data collection systems. Collecting and integrating data is just
the first step. What is really critical is information, knowledge and insight. So the
question is, what is the utility of the data? How can one use data in managing customer
relationships and empowering employees? How can one uncover patterns and
relationships hidden in organizational databases? These issues are addressed by a fast
growing body of research and applications, broadly known as business intelligence
(BI)/DM. These technologies draw their strengths from the fields of information
technology, statistics, machine learning and artificial intelligence.
In summary, managers need to understand the strategic values of their company's
information assets. DW, BI and DM are cornerstones of the infrastructure that leverages
these assets.
Course Objectives:
To develop an understanding of the strategic values of various concepts behind
warehousing and mining data. To develop an understanding of concepts in DW, BI, and
DM, and to gain hands-on experience of some DW, BI and DM software tools.
After taking this class, students should be able to:
• Understand the basic terms that are used in DW, Online Analytical Processing
(OLAP), BI and DM
• Communicate their business perspective to Information Technology workers in
the language of DW and DM
• Choose appropriate tools for specific purposes of storing, integrating and
analyzing data (business and technical considerations)
• Use tools provided in class to perform simulated tasks in warehousing data
• Use tools to perform BI/DM activities on moderately large data sets in case
studies. Students should be proficient with at least one of these software tools
• Articulate and present the results of their analyses and the business implications
of these results
• Draw inferences from their analyses from a statistical point of view
Structure of lectures:
IOM 528 will combine some of the following: lectures, case-based class discussion,
a group project, computer lab work, and possibly guest speakers.
Course Materials. The following items will be necessary for completion of reading
assignments and homework.

• Data Mining: Introductory and Advanced Topics, Margaret H. Dunham. This is a
standard data mining book, focused on business applications, that we will use for
our readings.
• IOM 528 Course Pack, Data Warehousing, Business Intelligence and Data Mining.
(This reader is non-returnable. It cannot be exchanged for cash or credit. Please be
sure you are permanently enrolled in the class before purchasing!)
• Data Warehousing: Using the Wal-Mart Model, Paul Westerman, Morgan Kaufmann
Publishers.
• Class notes.
Class notes for this class will be available on Blackboard. You should familiarize yourself
with these notes before they are covered in class. You will be using several software
packages to describe and analyze data.
Important dates:
Class Registration:
January 30: Last day to register and add class
January 30: Last day to drop a class without a mark of “W”
April 9: Last day to drop a class with a mark of “W”
Midterm exams:
TBA
Final Exams:
Tuesday, May 4, 2004, 7:00-9:00 pm.
Grading.
• Midterm 20% (take-home)
• Project + case presentation 20%
• Final 40%
• Homework + case studies 20%
Homework
Homework assignments will be distributed via Blackboard. Homework is extremely
important to your learning the material in the class. Homework assignments may be
discussed with members of your team (2 or 3 students). You have the following
objectives on your homework assignments:
• Answer the question you were asked.
• Argue clearly and concisely that your answer is correct.
We will judge your homework assignments by how clearly you communicate and
understand the material. Remember that nothing conveys clear thinking like clear
writing. The definition of clear writing includes the appropriate use of and reference to
computer output. If you examined certain graphs and/or printouts when arriving at your
solution then include that output in your report so that the reader can follow your logic to
your conclusion.
Computer output should be clearly labeled and referred to in the text. Ideally, the output
should be placed in a figure close to the textual reference. Including large sections of
computer output without reference in the text is a signal to the TA that you are not sure
what is important and what is not and will likely count against your grade.
If you believe that an error has been made in the grading of your homework you may ask
to have it regraded. Please be specific about the problem. If you are still concerned after
this process you may come and see me.
If you do not agree with the TA's grading, you may appeal your solution to me. Note,
however, that I will review your entire assignment and will include in my assessment of
your grade your oral arguments as well. I am a tougher grader than the TA, so be
prepared when you see me. I reserve the right to adjust your grade up or down as I see
fit.
Review Session. There will be a review session before the exams.
Academic Integrity. Academic dishonesty of any type will not be tolerated in this class.
Students who find this statement ambiguous should consult the Student Conduct Code,
page 83, of the USC SCampus handbook.
A comment about writing the assignments up individually and working in teams: You
can work together in teams to discuss the problems and concepts. However, you are
required to write up the assignments individually. This means that all the words in your
assignments are your own, and you generate all of your own computer output and graphs.
Now, while correct solutions will have very similar or even the same computer output, no
two answers should be phrased the same way. If I find two or more assignments that are
highly similar, I will at a minimum give the homework a zero, and may refer the incident
to the Dean. Do not test me on this policy.
STUDENTS WITH DISABILITIES
Any student requesting academic accommodations based on a disability is required to
register with Disability Services and Programs (DSP) each semester. A letter of
verification for approved accommodations can be obtained from DSP. Please be sure the
letter is delivered to me as early in the semester as possible. DSP is located in STU 301
and is open 8:30 am - 5:00 pm, Monday through Friday. The phone number for DSP is
213 740-0776.
Tentative Schedule:
The course will start with either Data Mining or Data Warehousing.

Lecture 1: Overview
DATA WAREHOUSING (DW) :

Lecture - DW1: A Strategic View
Data to knowledge to results, Davenport et al., Cal. Mgt Review 2001
Strategic View of DW and CRM, Swift, 2002.
Case Study: Canadian Tire

Lecture - DW2 : A Tactical View
Westerman, Chapter 1, 10, 11.
Walmart's DW, Swift, 2001.
DW Components, Berson & Smith, 1997

Lecture - DW3: Technology of DW
Westerman. Chapter 6, 7.
Relational DB, Computer World 2001
Normalization, Whitehorn & Marklyn, 1998
MetaData, Jennings, DM Review, 2000.
Mass Movement, Russom, Intelligent Enterprise, 2001
Case Study : Walmart

Lecture – DW4 : Dimensionally Designed DW (I)
The Business-Driven DW, Adamson & Venerable, 1998.

Lecture – DW5: Dimensionally Designed DW (II)
Hotel Occupancy Star Schema, Adamson & Venerable, 1998
Case Study: Star schema (notes)

Lecture - DW6: OLAP and Business Intelligence
Data Driven Decision Support, Dhar & Stein, 1997
The state of the BI market, Hackathorn, DM Review, 2001.
Business Intelligence Pays Dividends, Baron, Information Week, 2000

Lecture - DW7: Web-Based OLAP and Business Reporting
OLAP, Berson & Smith, 1997.
OLAP Goes Online, Baron, Information Week, 1999.
DATA MINING:

Lecture – DM1: Data Mining: an Overview of Application and Privacy
Issues
Mining data: Wasserman, Region Review 2000
Data Mining: what General Managers Need to Know, Jacobs, Harvard
Management Update, 1999.
None of Your Business, Stepanek, Business Week Online, 2000
Case Study : Capital One

Lecture – DM2: Using Data Mining Techniques for Personalization
Personalization dig deep, Colkin, Information Week, 2001.
Collaborative Filtering, Heylighen, 1999.
Nearest Neighbor Method, Watson, 1997.
Beyond Personalization, Brobst & Rarey, Teradata Review, 2000
Case Study: Firefly Network (now part of Microsoft)

Lecture – DM3: Decision Tree and Rule-Based Systems in Business
Applications
Decision Trees, Berry & Linoff, 1997
Case Study : Vermont Country Store

Lecture – DM4: Decision Tree and Neural Network in Business Applications
Making Brain Waves, Baatz, CIO Magazine 1995

Lecture – DM5: Understanding Neural Network & Data Mining Cases
Artificial Neural Network, Berry & Linoff, 1997.
Case Study : Real estate pricing model for houses in Rochester, MN.

Lecture – DM6: Putting Things Together: CRM & Relationship Technology
A framework for CRM, Winer, Cal. Mgt Review
Case Study : Mail Boxes Etc.

Lecture – DM7: Special Topics