COMP 4332 / RMBI 4330
Big Data Mining (Spring 2016)

Lei Chen
Hong Kong University of Science and Technology
leichen@cse.ust.hk
http://www.cse.ust.hk/~leichen

Topics
• Review of Basics
• Practical Data Mining
  – Imbalanced Data
  – Text and Web Mining
  – Big Data
  – Social Recommendation
  – Social Media and Social Networks
• Hands on: 2 Major Projects
• Student Presentations

2016/3/18 Course Introduction

Outcome and Objective
• Students will know the current state of the art in Data Mining
• Students will be able to implement a practical data mining project
• Students will be able to present their ideas well
• Students will be prepared for PG study, internships, etc.

Projects: based on KDDCUPs
• Project 1:
  – Predicting whether a funding request deserves an A+ (KDDCUP 2014)
    • April 5th, 2016
• Project 2:
  – Predicting dropouts in MOOCs (KDDCUP 2015)
    • May 10th, 2016

KDDCUP Examples
• In general, we wish to
  – Input: Data
  – Output:
    • Build model
    • Apply model to future data
• KDDCUP from past years
  – 2007:
    • Predict whether a user is going to rate a movie
    • Predict how many users are going to rate a movie
  – 2006:
    • Predict whether a patient has cancer from medical images
  – 2005:
    • Given a web query (“Apple”), predict the categories (IT, Food)
  – 1998:
    • Given a person, predict whether this person is going to donate money

Important Sites
• Course Web Site: http://www.cse.ust.hk/~leichen/comp4332
• TAs: Yue Wang and Konstantinos Giannakopoulos
• Assignment Hand-in: CASS

Prerequisites
• Statistics and Probability would help
  – But will be reviewed in class
• Machine Learning/Pattern Recognition would help
  – We will review some of the most important algorithms
• One programming language
  – We will teach new languages in the tutorial

Grading
• Midterm Exam: 20%
• Course Projects: 60%
• Presentations: 10%
• Term Paper: 10%

More info
• Textbooks:
  – Listed on Course Website
  – Buy them online if you wish
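
Example: build a model, apply it to future data
The general workflow on the KDDCUP Examples slide (input data, build a model, apply the model to future data) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy records, the names build_model and predict, and the 0.5 threshold are hypothetical and are not taken from any KDDCUP dataset or from the course materials. The "model" is only a per-user rating frequency, loosely in the spirit of the 2007 task of predicting whether a user will rate a movie.

```python
from collections import defaultdict


def build_model(history):
    """Build a toy model: the fraction of past movies each user rated.

    `history` is a list of (user, rated) pairs, where `rated` is a bool.
    This is a hypothetical stand-in for a real learning algorithm.
    """
    counts = defaultdict(lambda: [0, 0])  # user -> [num_rated, num_seen]
    for user, rated in history:
        counts[user][0] += int(rated)
        counts[user][1] += 1

    total_rated = sum(r for r, _ in counts.values())
    total_seen = sum(s for _, s in counts.values())
    # Fallback rate for users that never appear in the training data.
    global_rate = total_rated / total_seen if total_seen else 0.0

    per_user = {user: r / s for user, (r, s) in counts.items()}
    return {"per_user": per_user, "global_rate": global_rate}


def predict(model, user, threshold=0.5):
    """Apply the model to a (possibly unseen) user: will they rate a movie?"""
    rate = model["per_user"].get(user, model["global_rate"])
    return rate >= threshold


# Input: data (made-up records, for illustration only).
history = [
    ("alice", True), ("alice", True), ("alice", False),
    ("bob", False), ("bob", False),
    ("carol", True),
]

# Output: build the model, then apply it to future data.
model = build_model(history)
predictions = {user: predict(model, user) for user in ["alice", "bob", "dave"]}
print(predictions)
```

Here "dave" never appears in the training data, so the sketch falls back to the overall rating rate. A real project would replace the frequency table with a proper learning algorithm and evaluate it on held-out data.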