Spring 2016 - University of Houston

advertisement
ISAM 5332
PREREQUISITE:
DATA WAREHOUSING AND DATA MINING
SPRING 2016
ISAM 5331 – Database Design and SQL or equivalent knowledge.
CLASSROOM & TIME: SSB 3310/3.201.02 (MIS Database Lab); Mondays, 7:00 PM – 9:50 PM
INSTRUCTOR:
Mohammad A. Rob, Ph.D.
Office: SSB Suite 3-202-9
Voice: (281) 283-3191
E-mail: rob@uhcl.edu Course Website: https://mis.uhcl.edu/rob
Office Hours: Mondays, 5-7 PM & Wednesdays: 2-4 PM; and walk-ins &
appointment.
Teaching Assistant: Check course website.
COURSE REQUIREMENTS:
 Required Text: Data Warehousing Fundamentals: A Comprehensive Guide for
IT Professionals, Author: Paulraj Ponniah, Second Edition, John Wiley & Sons;
ISBN: 978-0-470-46207-2.
COURSE FORMAT:

Required Material: Instructor’s Class Notes - available in the Course Web site.

Required Material: Create a Pivot table in Excel 2013. Online Version from
Microsoft. Available through Instructor’s website.

Required Materials: Microsoft SQL Server 2012 Analysis Services
Multidimensional Modeling Step by Step, Online Version from Microsoft.
Available through Instructor’s website.

Recommended Material: Microsoft SQL Server 2012 Analysis Services Data
Mining Step by Step, Online Version from Microsoft. Available through
Instructor’s website.

Required Tool: Microsoft SQL Server 2012 Analysis Services (available in the
database lab).

Recommended Text: Data Mining Concepts and Techniques, Authors: Jiawei
Han and Micheline Kamber, Morgan Kaufmann Publishers/Academic Press.
FACE TO FACE
COURSE DESCRIPTION, GOALS, AND LEARNING OUTCOMES:



This course will acquaint students with the fundamental knowledge of data
warehouse modeling and design.
It will also provide students with the knowledge of tools and techniques of data
warehouse development in a corporate business environment.
It will further familiarize students on the tools and techniques of data
interrogation of warehouse data using Pivot tables and OLAP (Online Analytical
Processing) methodology.
1



It will prepare students for future careers in data warehouse planning, analysis,
design, and implementation as well as making business forecasting using
decision-making tools.
It will help students to pass data warehousing certification.
If time permits, there will be discussions on data mining concepts and techniques.
COURSE ACTIVITIES: The course will contain lectures, homework assignments, a group project, group
presentations, a test, a research paper and two reports. At least 50% of the activities
will be hands-on practice by students using Microsoft tools such as Excel, SQL
Server Data Warehouse, Analysis Services and Visual Studio.
RESEARCH PAPER:
Students will write a paper understanding the current job prospect of a data
warehouse analyst. Refer to the course web site for the details of the paper
requirements.
GROUP PROJECT:
Students will work in groups to model, design, and develop a data warehouse. Each
group will choose a business industry such as airline, education, retail, financial,
insurance, hospitality, investment, and healthcare. They will then develop/collect
day-to-day business data of various formats which can be found in files, databases,
spreadsheets, or text documents. They will then develop a strategy to transfer these
data in a common format and in summarized form to a data warehouse developed in
Microsoft Access. They will then transfer that data into a SQL Server data
warehouse. Finally, students will apply OLAP tools to extract meaningful business
intelligence reports on customers, products, purchases, and so on. Refer to course
website for details. Each group needs to submit a final report as outlined in the
website.
GROUP PRESENTATION: Each student group will make 2-3 presentations on the project mentioned above.
The presentations will follow a schedule and they will cover topics of (i) project
definition and planning, (ii) data warehouse modeling and expected business
intelligence reports, and finally (iii) demonstrate OLAP implementation and reports.
Each student must participate in the presentation.
ATTENDANCE:
Students are expected to be physically present in the class and participate in the
discussion on presentations by others. Class rolls may be called anytime.
EVALUATION/GRADING POLICY:
Midterm
50%
Homework
10%
Group Project - Presentation
10%
Group Project - Final Report
10%
Individual Report – Analysis Services/Excel Tabular Modeling 5%
Discussion/Attendance
5%
Research Paper
10%
____________________________________________________________________
Total
100%
2
CERTIFICATION:
Students performing poorly in the test may improve their score by passing a
professional certification exam such as Microsoft Exam 70-463: Implementing a
Data Warehouse with Microsoft SQL Server 2012. At best 10% of the overall course
grade can be earned through the certification. Refer to the course website for details
on certification.
GRADE DISTRIBUTION:
A–
B–
C–
D–
F
=
=
=
=
=
90 – 93,
80 – 83,
70 – 73,
60 – 63,
59 and below
A
B
C
D
=
=
=
=
94 – 100,
84 – 86,
74 – 76,
64 – 66,
B+ = 87 – 89,
C+ = 77 – 79,
D+ = 67 – 69,
OTHER INFORMATION:
A.
Class attendance: Regular class attendance is required and rolls may be called anytime.
B.
UHCL Software: http://software.uhcl.edu
C.
Missing Tests and Assignments: Missing tests and laboratories will be counted as zero. Make-up of
missing tests and any late submission of assignments will be acceptable only under extreme emergencies.
D.
Academic Honesty: The Academic Honesty Policy at UHCL (found on the Dean of Students’ website,
the Faculty Handbook, the Student Handbook, the Senior Vice President and Provost’s website, the
Graduate Catalog, and the Undergraduate Catalog) states: Academic honesty is the cornerstone of the
academic integrity of the university. It is the foundation upon which the student builds personal integrity
and establishes a standard of personal behavior. Because honesty and integrity are such important factors
in the professional community, you should be aware that failure to perform within the bounds of these
ethical standards is sufficient grounds to receive a grade of "F" in this course and be recommended for
suspension from UHCL. The Honesty Code of UHCL states "I will be honest in all my academic activities
and will not tolerate dishonesty."
COPYING FROM ONLINE BOOKS & FROM EACH OTHER WILL NOT BE TOLERATED.
ABSOLUTELY NO CELL PHONES WILL BE ALLOWED DURING THE TEST.
E.
Special Academic Accommodations: If you believe you have a disability requiring an accommodation,
contact Disability Services at 281‐283‐2648 or disability@uhcl.edu as soon as possible and complete their
registration process. The University of Houston System complies with Section 504 of the Rehabilitation
Act of 1973 and the Americans with Disabilities Act of 1990, pertaining to the provision of reasonable
academic adjustments/auxiliary aids for students with a disability. In accordance with Section 504 and
ADA guidelines, each University within the System strives to provide reasonable academic
adjustments/auxiliary aids to students who request and require them.
F.
Incomplete Grade: A grade of “I” (Incomplete) will be administered only under extreme, verifiable
emergency” situation where the student in unable to complete some minor portion of the course work due
to circumstance beyond his/her control provided the student is passing the course.
G.
LAST DAY TO DROP/WITHDRAW FROM THE SEMESTER: APRIL 12, 2016.
3
COURSE SCHEDULE
(SUBJECT TO CHANGE IF DEEMED NECESSARY)
Date-2016
Lecture & Skill/Activity
Due Dates
January 25
Syllabus Review & Introduction
February 1
Background and definitions of data warehouse, data
marts, and data mining (Chapter 1 of Ponniah)
February 8
The data warehouse architecture (Chapters 2, 6, and 7
of Ponniah)
February 15
The principles of dimensional modeling (Chapter 5 and
10 of Ponniah)
Submit Review Questions - Chapter 1 & 2
February 22
Advanced topics in dimensional modeling (Chapter 11
of Ponniah)
Start of First Group Presentations – Problem
Definition and Planning
February 29
Online Analytical processing (Chapter 15 of Ponniah)
Submit DW Analyst Paper
March 7
Introduction to SQL Server Database and Create a new
Database
Submit Review Questions - Chapters 5 & 10
March 14
Spring Holiday
March 21
Create a Pivot Table in Excel using Contoso Database
March 28
Test-I: Essay/short answers - Chapters 1, 2, 5, 10, 11
and Class Notes
Submit Review Questions - Chapters 11 & 15
April 4
SQL Server 2012 Analysis Services: Lesson 1
Start of Second Group Presentation –Multidimensional Data Modeling
April 11
SQL Server 2012 Analysis Services: Lesson 2
Submit Completed Contoso Pivot Table in
the Web Folder
April 18
SQL Server 2012 Analysis Services: Lesson 3
Submit individual Report on Analysis
Services (see below)
April 25
Implementing the Group Project
Start of Third Group Presentation –
Demonstration and OLAP Reports
May 2
Continue working on the Group Project
May 9
Finishing the Group Project
Formation of Groups (3 students)
Submit Final Project Report (see below)
4
Individual Report on OLAP Test Project
After learning to develop Cubes through SQL Server Analysis Services, each student will be required to
document through texts and screen shots, the steps of test Cube development and results of at least three
OLAP queries from the Excel Pivot Table. In this case, students will be using Visual Studio/SQL Server 2012
along with a sample SQL Server AdventureWorks data warehouse to create a business intelligence project
and go through the steps following the note: SQL Server 2012 analysis Services Multidimensional Modeling
Step by Step. Refer to course website for a sample report. Again, your report should be about 10-15 pages
long and it should contain 3-5 OLAP reports along with your analysis and conclusion.
Group Project Final Report
You are to write a report for the group project you have done during the semester. The paper should clearly
describe the concepts and purpose of data warehousing, the business scenario of your project and proposed
outcome, the approach of dimensional modeling, the design of data warehouse, the research methodology
including data gathering, data cleansing, data transfer, as well as development of data warehouse and OLAP
cube using a specific tool such as SQL Server Analysis Services 2012. The implementation part should
describe dimensions and facts as well as the reasons to consider them. The results should provide appropriate
strategic information for your business.
Use appropriate texts and graphics to describe and display necessary activities and data. The following is a
typical outline of the paper; however, you are free to change the titles of the sections. But the sections must
follow typical steps of your project development - beginning from the business scenario and ending with the
explanation OLAP results along with figures and graphs:
Title
Name of the Students and affiliation
Abstract
Business Scenario: Problem statement, data origins, and expected value
Why Data Warehousing?
Methodology
Defining raw data format
Data cleansing
Data Transfer (Text file, Excel, Access/SQL Server)
Dimensional Modeling: Defining dimensions, facts, attributes, and hierarchies, database schema
DW implementation (in Access/SQL Server): Screen capture of relationships, dimensions, facts
Cube Implementation and OLAP in SQL Server
Results: Screen capture of drill-down, roll-up, slicing and dicing, graphs, along with explanations
Conclusion and Discussion
5
Download