FALL 2015 - Department of Computer Science


CSC 7442 – Data Mining and Knowledge Discovery from Datasets

FALL 2015

Louisiana State University

School of Electrical Engineering and Computer Science

Division of Computer Science and Engineering

Instructor: Professor Evangelos Triantaphyllou

Office: 341 Hatcher Hall, Phone: 578-1348, E-mail: trianta (at) csc.lsu.edu

Webpage: http://www.csc.lsu.edu/trianta

Classroom: 2228 Patrick Taylor Hall

Meeting Times: Tuesday & Thursday: from 3:00 to 4:20 PM

Office hours: Tuesday & Thursday from 10:00 to 11:30 AM or by appointment

TA: Dipak Singh, email: dsingh8 (at) lsu.edu

Textbook: Introduction to Data Mining, by Tan, Steinbach and Kumar, published by Addison Wesley, 1st edition, and handouts (PDF files and/or URLs).

Prerequisites:

Knowledge of a general purpose programming language, elements of graphs, and Boolean algebra.

GOAL :

The main goal of this course is to introduce students to the fundamental issues of the fast-emerging field of data mining and knowledge discovery from databases. Emphasis is placed on the principles that govern this new area of computer science.

MAIN TOPICS:

1. Introduction

- Introduction to Data Mining and Knowledge Discovery in Databases

- The concept of data mining and knowledge discovery in databases

- The data mining tasks

- Relationships of DM/KDD with other research fields: statistics, machine learning

2. The Data

- Data pre-processing: data cleaning, scaling and transformations

- Dimensionality reduction, feature selection

3. Cluster Analysis

- K-means and hierarchical clustering

4. Other Classification Techniques

- Decision trees

- Support vector machines

- Nearest neighbor classifier, Bayesian classifier and ANN

- Model overfitting

5. Boolean Function-Based Classification Techniques

6. Guided Learning

7. Association Analysis

8. Model Evaluation

9. Some Special Topics on Data Mining and Knowledge Discovery in Databases


GRADING:

Homework assignments    : 25%
Pop up quizzes (3-5)    : 5%
1 Midterm Exam          : 30%
1 Final Exam            : 40% ; this is comprehensive and closed books/notes.
Total points            : 100%

COMMENTS :

[1] : The Grading Scale:

SCORE                   GRADE
100, 99, 98             A+
97, 96, 95, 94, 93      A
92, 91, 90              A-
89, 88, 87              B+
86, 85, 84, 83          B
82, 81, 80              B-
79, 78, 77              C+
76, 75, 74, 73          C
72, 71, 70              C-
69, 68, 67              D+
66, 65, 64, 63          D
62, 61, 60              D-
59 or less              F

An F grade (Fail) will be earned for 59 or fewer points or if academic misconduct is detected (please also see comment #3, below). Note that a D grade is not a passing grade for graduate students.
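As an informal illustration of how the weights and the grading scale combine, the following is a minimal Python sketch. The function names are hypothetical, each component is assumed to be expressed as a percentage from 0 to 100, and fractional totals are assumed to be rounded to the nearest whole point because the scale lists only integer scores; the syllabus does not state a rounding rule, so the instructor's actual procedure may differ. The automatic F for academic misconduct is not modeled.

```python
# Weights from the GRADING section (percent of the final score).
WEIGHTS = {"homework": 25, "quizzes": 5, "midterm": 30, "final": 40}

# Lowest score of each letter-grade band, taken from the grading scale.
BANDS = [(98, "A+"), (93, "A"), (90, "A-"),
         (87, "B+"), (83, "B"), (80, "B-"),
         (77, "C+"), (73, "C"), (70, "C-"),
         (67, "D+"), (63, "D"), (60, "D-")]


def total_score(percent_by_component):
    """Weighted total out of 100; each input is a 0-100 percentage."""
    return sum(WEIGHTS[name] * percent_by_component[name] / 100
               for name in WEIGHTS)


def letter_grade(score):
    """Map a total score to a letter grade."""
    points = int(score + 0.5)  # assumption: round half up to a whole point
    for lowest, letter in BANDS:
        if points >= lowest:
            return letter
    return "F"  # 59 or fewer points


example = {"homework": 90, "quizzes": 80, "midterm": 85, "final": 88}
example_total = total_score(example)
example_grade = letter_grade(example_total)
print(example_total, example_grade)
```

For the example scores, the weighted total is 22.5 + 4 + 25.5 + 35.2 = 87.2 points, which falls in the B+ band under the rounding assumption above.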

[2] : Class attendance and active participation are strongly encouraged (but not considered in the grade).

[3] : No cheating or plagiarism will be tolerated.

A student caught cheating or plagiarizing will be given an F grade immediately as the final grade for the (entire) course and in addition any such case will be immediately reported to the Office of the Dean of Students at LSU.

LSU’s Honor Code governs all the work in this course (please check all the materials at: http://saa.lsu.edu/code ). Unless indicated otherwise, all written work that is handed in must be done only by the individual whose name appears on the document. Regarding cooperation on the homework assignments, you are encouraged to discuss the homework problems with other students; however, what you submit must be your own answers. Your instructor and teaching assistant are authorized to give you help on all work (help will not be given if it provides an unfair advantage). Students are strongly encouraged to carefully study the contents of LSU’s Honor Code on plagiarism and other issues of academic misconduct. Thus, you are very strongly advised to visit the website of LSU’s Office of Student Advocacy and Accountability at: http://saa.lsu.edu/

For specific information on LSU’s definition of what constitutes academic misconduct, please visit the following website: https://saa.lsu.edu/code-10_2-academic-misconduct

For some other related concepts of what constitutes plagiarism, please visit the following website: http://homeworktips.about.com/od/citationsandbibliography/a/What-Is-Plagiarism.htm


[4] : No food or drinks are allowed in the classroom. No use of cell phones is allowed (unless there is a special situation explained prior to a meeting). During the exams (midterm and final) only very low capability calculators (i.e., ones with simple/basic arithmetic operations) may be used. No other devices are allowed.

Some instructions in preparing HWs and Exams

1) Make sure that you understand a given problem before you attempt to answer it.

2) Try your best in solving all problems in a given HW or exam by YOURSELF.

3) Pay extra attention to the organization of your answers. Solutions should not be mixed up with the scratch material used in deriving them.

4) Present and staple your solutions sequentially.

5) Make sure that your English is correct, clear and easy to read. Although this is not a composition course, it is very important that you present your solutions in a highly professional and scientific manner.

6) If you get confused, do not worry. Just relax, organize your thoughts, and try to see the problem from a simple, but still accurate, point of view.

7) You should have no doubt that you (assuming you have studied adequately) are capable of answering all the problems in the HWs and exams!

8) New policy for “wireless communication devices and hand held computers”: All such devices must be turned off in all class meetings, including examinations, and kept out of your reach at all times. During exams only the most elementary calculators (with the basic arithmetic functions) are allowed. No advanced calculators will be allowed. It is your responsibility to bring a simple calculator to the exams.

A hint...

Next is the first question that is very likely to be given to you in the midterm and final exams, together with the correct solution. It is usually assigned 5 points.

**********************************************************************

PROBLEM 1: (5 points)

What is the single most important step in solving any science / engineering problem?

Answer: To define the problem correctly and enjoy the solution process.

(You must underline the right words to earn all 5 points.)
