Kennesaw State University
DEPARTMENT OF COMPUTER SCIENCE AND INFORMATION SYSTEMS
CS8560: Data Mining

Instructor: Dr. Ying Xie
Office Number: CL3019
Office Hours:
Phone: (678) 797-2143
Email: yxie2@kennesaw.edu

Course Description:
Due to the wide availability of huge amounts of data and the imminent need to turn that data into information and knowledge, data mining has attracted significant interest in recent years. Its applications range from business analysis and scientific discovery to medical diagnosis and engineering design. This course covers major data mining concepts and techniques for uncovering interesting patterns hidden in large data sets, including data warehousing and OLAP technology, association mining, classification and prediction, clustering analysis, and time-series analysis.

Prerequisites:
CS 8530 Database Administration

Textbooks:
Data Mining: Concepts and Techniques, Second Edition (The Morgan Kaufmann Series in Data Management Systems) by Jiawei Han and Micheline Kamber

Learning Objectives:
- Understand the following data warehousing techniques: dimensional modeling, ETL, and OLAP
- Be able to design and implement a data warehouse
- Understand major data mining algorithms, including association mining, clustering, classification, and time-series analysis
- Be able to implement data mining applications

Learning Outcomes:
1. Students will demonstrate skills in data warehouse design and implementation
2. Students will demonstrate skills in applying data mining algorithms and tools to extract various patterns from different types of data
3. Students will demonstrate skills in data mining algorithm design

Assignments and Course Project:
A series of assignments will be given to reinforce the concepts learned in class. Students are also required to complete a course project that involves hands-on practice with the core data warehousing and data mining technologies.

Assessment and Grade Evaluation:

    Component         Weight
    Assignments*      25%
    Course Project*   25%
    Midterm Test*     25%
    Final Test*       25%

    Grade   Score
    A       90% - 100%
    B       80% - 89%
    C       70% - 79%
    D       60% - 69%
    F       Below 60%

* Penalty for late submission applies to all assignments and tests

Course Topics (Tentative, subject to change):

    Topic                       Duration
    Data mining introduction    1 week
    Dimensional model           2 weeks
    OLAP techniques             1 week
    ETL technology              2 weeks
    Association mining          1.5 weeks
    Classification              1.5 weeks
    Clustering                  1.5 weeks
    Time series analysis        1.5 weeks
    Data mining applications    2 weeks

Course Materials:
All course materials will be posted at http://csmoodle.kennesaw.edu. Please register on this website using your KSU student email (netid@students.kennesaw.edu).