2002_Spring_CS525_Intro

CS525 DATA MINING
COURSE INTRODUCTION
YÜCEL SAYGIN
SABANCI UNIVERSITY
Contact Info

ysaygin@sabanciuniv.edu

http://people.sabanciuniv.edu/~ysaygin


Tel : 9576
No specific office hours. You can drop by anytime
you like. Email or call me first to make sure I am in
the office.
Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program
Course Info



Reference Book: Data Mining: Concepts and
Techniques
Authors: Jiawei Han and Micheline Kamber
Publisher: Morgan Kaufmann
Course Info

Grading:

Midterm : 30% (April 14-18)

Homework : 10%

Project : 30%

Paper presentation : 10%

Term Paper : 10%

Attendance during paper presentations: 10%
Topics that will be covered

Different Data Mining Techniques






Association Rules
Classification
Clustering
Data Mining and Security Issues
Applications of Data Mining
Data Warehousing
Aim of the course

Knowledge:


To introduce data mining concepts
Skills:


paper reading and presentation
research and/or project work
A Rough Schedule

March, April, First Week of May:



Lectures on various data mining techniques
Invited speakers from industry to share their
experiences
Remaining 4 weeks: Paper presentations and
discussions in class about research issues
What I will do

Give the basics on data mining





broad data mining concepts
research issues
Project supervision
Give direction and advice on the projects I
proposed (will be provided in the following slides)
Coordination of the presentations
What I expect you to do



I expect you to work according to your background
and expertise.
Students with CS background will do projects
involving implementation and/or research
Others can do application projects



On a real application
That will involve data collection, cleaning, etc.
With at least two data mining tools that will be compared in
terms of functionality for the chosen application
What I expect you to do





Understand the basic data mining concepts
Choose a specific area and two related papers on
the same topic for presentation in class
Attendance is required for paper presentations,
and you will lose 2% of your overall grade for each
presentation you miss.
Write a term paper on the two papers presented.
Do a project and a final report describing what
you learned or achieved in the scope of the
project.
Projects


Data Mining and Game Theory. Will be co-supervised
with Ozgur Kibris from Economics (mostly research
and survey, may involve algorithm design; good for
students in SLP)
Implementation of algorithms for data security
against data mining methods (pure algorithm
survey and implementation, good for CS students
who like implementation)
Projects


Development of algorithms for protecting sensitive
data against various data mining algorithms
(research and implementation, good for CS
students)
Hiding sequential patterns in temporal data by
changing time granularities is an example
Survey and implementation of the existing
privacy-preserving data mining methods (pure
implementation, good for CS students)
Introduction to Data Mining



Why do we collect and process historical
data?
What is the purpose of data mining?
What are the applications?
Introduction to Data Mining


Data is mostly stored in data warehouses
Data Mining Techniques are used to analyse
the data:



Association rule finding from transactional data
Clustering of data with multiple dimensions
Classification of given data into predefined
classes
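
As a rough illustration of the first technique above (association rule finding from transactional data), here is a minimal sketch. The basket data, the function name `association_rules`, and the threshold values are all hypothetical and chosen only for illustration; it considers item pairs only, not the full algorithms covered in the reference book.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]


def association_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    Only single-item antecedents and consequents are considered.
    support(X -> Y)    = fraction of transactions containing both X and Y
    confidence(X -> Y) = count(X and Y) / count(X)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        item_counts.update(t)
        # Sort items so each pair is always counted under the same key.
        pair_counts.update(combinations(sorted(t), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is not frequent enough
        # Try the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    for x, y, s, c in association_rules(transactions):
        print(f"{x} -> {y}  support={s:.2f}  confidence={c:.2f}")
```

Raising `min_support` or `min_confidence` keeps only the stronger rules; on this toy data every pair appears in 3 of 5 baskets, so all rules share the same support and confidence.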