Fundamentals of Data Mining
Fall 2014
CPTS 483 & 580, Tuesday and Thursday, Noon-1:15pm

Instructor: John Miller, PhD
Office: West 134E, WSU Tri-Cities
Email: jhmiller@tricity.wsu.edu
Class web page: http://www.tricity.wsu.edu/~jhmiller

Required Textbook:
  Learning from Data
  By: Abu-Mostafa, Magdon-Ismail, and Lin
  Website: AMLbook.com
    Contains slides and video of the course at Caltech
  Short text (200 pages)
  Central theme: Can data-mining results be trusted?
  Caltech course goes beyond the text
    Probably also true for this class

Grades:
  Assignments, quizzes, and final exam have equal weight of 1/3
  Graduate credit requires a project approved by the instructor
  Tests are given in class with open books and lecture notes
  Quizzes are designed to reward reading the text and lecture notes
  Final exam will contain problems like those worked in class
  Assignments will require programming
    See me if you can't meet a due date
  Graduate project reports: 5-10 pages, double spaced, with figures
    Due the last class period before exam week

Objectives of the class:
  1. Learn basic methods of data mining
  2. Learn basic principles that ensure quality

More nuts and bolts

Accommodations for Disabled Students: Reasonable accommodations are available for students who have a documented disability. If you have a documented disability, even a temporary one, make an appointment as soon as possible with the Disability Services Coordinator, Cherish Tijerina, 372-7352, ctijerina@tricity.wsu.edu. You will need to provide your instructor with the appropriate classroom accommodation form. The forms should be completed and submitted during the first week of class. Late notification may delay your accommodations. All accommodations for disabilities must be approved through Disability Services. Classroom accommodation forms are available through the Disability Services Office.
Academic Integrity: As stated in the WSU Tri-Cities Student Handbook, "any member of the University community who witnesses an apparent act of academic dishonesty shall report the act either to the instructor responsible for the course or activity or to the Office of Student Affairs." The Handbook defines academic dishonesty to include "cheating, falsification, fabrication, multiple submission [e.g., submitting the same or slightly revised paper or oral report to different courses as a new piece of work], plagiarism, abuse of academic material, complicity, or misconduct in research." Infractions will be addressed according to procedures specified in the Handbook.

Safety: Should there be a need to evacuate the building (e.g., fire alarm or some other critical event), students should meet the instructor at the Cougar statue directly outside the West building. A more comprehensive explanation of the campus safety plan is available at http://www.tricity.wsu.edu/safetyplan/. The university emergency management plan is available at http://oem.wsu.edu/emergencies/. An alert system is also available: you can sign up for emergency alerts (see http://alert.wsu.edu) through the zzusis site (http://portal.wsu.edu/).

Student Concerns: If you have any student concerns, you can contact Carol Wilkerson, the Director of Student Affairs, in West 269F, (509) 372-7139, or carol.wilkerson@tricity.wsu.edu. If you have any concerns about this class, you should contact your instructor first, if possible.

Attendance Policy: Absences should be avoided. Students should contact the instructor if an absence from class is unavoidable. Students are encouraged to read Section 73 (Absences) of the Washington State University Academic Regulations, which is found in the WSU Tri-Cities Student Handbook.
Tentative Schedule

Tu Aug 19    Discussion of class syllabus
Th Aug 21    Introduction to supervised machine learning
Tu Aug 26    Introduction to supervised machine learning
Th Aug 28    Introduction to Bayesian statistics
Tu Sep 2     Introduction to Bayesian statistics
Th Sep 4     Parametric methods
Tu Sep 9     Parametric methods
Th Sep 11    Multivariate data
Tu Sep 16    Multivariate data
Th Sep 18    Test #1
Tu Sep 23    Artificial neural networks
Th Sep 25    Artificial neural networks
Tu Sep 30    Artificial neural networks
Th Oct 2     Artificial neural networks
Tu Oct 7     Artificial neural networks
Th Oct 9     Genetic algorithm
Tu Oct 14    Genetic algorithm
Th Oct 16    Radial basis functions
Tu Oct 21    Radial basis functions
Th Oct 23    Test #2
Tu Oct 28    Self-organizing maps
Th Oct 30    Self-organizing maps
Tu Nov 4     Advanced applications
Th Nov 6     Advanced applications
Tu Nov 11    Advanced applications
Th Nov 13    Support vector machines
Tu Nov 18    Support vector machines
Th Nov 20    Support vector machines
Tu Nov 25    Thanksgiving break
Th Nov 27    Thanksgiving break
Dec 2, 4     Review
Dec 8-12     Finals week: Test #3