10 points - NDSU Computer Science

Syllabus
CS479(7118) / 679(7112): Introduction to Data Mining Spring-2008
william.perrizo@ndsu.edu course web site: http://www.cs.ndsu.nodak.edu/~perrizo/classes/#0
Text: Data Mining, Han and Kamber, 2nd edition.
Office Hours: MWF 11-11:50, in IACC 258 A15
(others by appointment)
Please use email for questions that can be emailed. If you have a question that cannot
be adequately stated or answered by email, please use the office hours. I need to
ask that you please not come in to office hours if you have a cold or flu or another
infection (until it is non-infectious). Thank you so much.
All assignments and your term paper must be SUBMITTED THROUGH
BLACKBOARD. (DO NOT email to william.perrizo@ndsu.edu as previously
instructed). All records will be kept on the Blackboard system and will be
available to you from there.
When submitting your assignments and term paper through BLACKBOARD, please
identify your work by using your first_name.last_name just as it appears in your
NDSU email address (e.g., mine is william.perrizo).
Lectures and lecture notes are available from http://www.cs.ndsu.nodak.edu/~perrizo/classes/#0
and also from BLACKBOARD. Additional materials are also available on the website.
COURSE DESCRIPTION
Topics: Introduction to Data (data processing, data warehousing and data cubes); and Data Mining
(association rule mining, classification or prediction and clustering).
COURSE OBJECTIVES: Understand the fundamentals of data mining. Gain experience in data mining
research and in the written reporting of it.
TERM PAPER (100 points): Each student will pick an application area (or focus area) in which to
concentrate. Your focus area can be an area of application of data mining such as Bioinformatics,
Medical Computer Aided Detection, etc. (Chapters 8, 9, 10 and 11 are rich with suggested application
areas. Read those chapters to get help choosing a focus area. Your term paper and your assignment
solutions will be directed toward your focus area.). You should choose your focus area very early (first
week!). You can set or change your focus area at any time by emailing your choice to me. Each student will
have a unique focus area (first come, first served via my email queue). Changing focus areas is even
encouraged - as it will give you a chance to learn about more than one focus area. Note that your term
paper should be in the focus area you end up with. Each assignment solution must describe relevance
to your posted focus area at the due date of that assignment (see below). Your term paper will be a
topic from that focus area (some example topics and focus areas in html are at Possible Topics and in
powerpoint at Possible Topics) or your own RESEARCH topic - but it must be a new RESEARCH idea
of yours, NOT A PAPER written by someone else or a paper written for another course or for
conference or journal publication. Included in the Possible Topics files is a complete set of
guidelines on what to include in your paper and what format to use. Note that the guidelines are also
available from the Blackboard system. Research the topic and write a quality (publishable in archival
media?) paper. Topics will be approved 1st-Come-1st-Served (email title and abstract to
william.perrizo@ndsu.edu). Papers are graded on contribution, level of current research interest, depth,
correctness, clarity, and insight. 679 students, as graduate students, will be expected to achieve a higher
level of true research on their paper.
Assignments (70 points): Each chapter has more than 10 exercises in the back. Please choose any 10 to solve and
upload your solutions to Blackboard by the due date (see next slide). Please also make all of your
solutions relevant to your chosen FOCUS AREA. Every solution should answer the question, "How
does this apply to my FOCUS AREA specifically?" Changing focus areas is even encouraged - as it will
give you a chance to learn about more than one focus area. Each assignment solution must describe
relevance to your posted focus area as of the due date of that assignment.
COURSE Assignments:
Course website: http://www.cs.ndsu.nodak.edu/~perrizo/classes/#0
Assignment 1 is due January 18, 5PM (10 chapter 1 exercises) (10 points)
Assignment 2 is due February 1, 5PM (10 chapter 2 exercises) (10 points)
Assignment 3 is due February 15, 5PM (10 chapter 3 exercises) (10 points)
Assignment 4 is due February 29, 5PM (10 chapter 4 exercises) (10 points)
Assignment 5 is due March 14, 5PM (10 chapter 5 exercises) (10 points)
Assignment 6 is due March 28, 5PM (10 chapter 6 exercises) (10 points)
Assignment 7 is due April 11, 5PM (10 chapter 7 exercises) (10 points)
The Term Paper is due May 9, 5PM (100 points)
Final Exam will be an oral exam over your paper and chapters 5, 6, 7 (70 points)
Grades will be based on a grade curve of your total points out of 240 points
On all assignments, you must work alone. Please do not share your work with anyone or
accept work shared by anyone else. Submit assignments and paper through BLACKBOARD.
You can schedule your final exam with me for any 20 minute period between 11 and 11:50
MWF by emailing me your choice (first come, first served via my email queue). You
must schedule your exam by March 14, but you can schedule it (and take it, if you
choose) any time before that too.
COURSE DESCRIPTION continued
REQUIRED MATERIALS: The text, email, and WWW access are required.
STUDENTS NEEDING SPECIAL ACCOMMODATIONS or who have special
needs are invited to share that information with the instructor.
PREREQUISITES: CS366 or equiv. Student must be able to read and follow
technical, detailed instructions and adapt solutions.
ACADEMIC HONESTY: Work must be completed in a manner consistent with
NDSU Senate Policy 335: Code of Academic Responsibility and Conduct.
The goals of this course include initiating students into data and data mining
systems research and enhancing students' written presentation skills.
Additional reference material on all topics in this course can be found on the web by
doing a Google (or Yahoo or Ask) search on the appropriate keyword(s) and also
by using the NDSU library.
Good luck in your 479/679 course!
Focus Areas and Term Paper Titles chosen so far
Date, Name
jan 8, Jason Stone
jan 10, Ken Brown
jan 13, Basudha Pradhan
jan 15, Karl Gunderson
jan 18, Krishnakanth Ireddynaga
jan 18, Jianfei Wu
jan 18, Chaitanya Dumpala
jan 19, Loai Al-Nimer
jan 20, Samuel Kondamarri
jan 23, Dibakar Bhowmick
Focus Area / Term Paper Title
Automatic Alerters in S. E.
Financial Data
Loan Payment Prediction/Classification for Customer Credit Policy Analysis
Transactional Data
Two products interaction (co-occurrence at checkout x) to maximize sales
pattern representation, comparison and analysis
DATA MINING THE WORLD WIDE WEB
Stock Data
Pattern Recognition, Classification and performance analysis Using Markov modelling Techniques
Inter Entity Correlation
Correlating protein domains and bacterial properties for entire bacterial genomes
Software Engineering
Music and Musical instrument data analysis