Introduction - Hong Kong University of Science and Technology

COMP 4332 / RMBI 4330
Big Data Mining (Spring 2016)
Lei Chen
Hong Kong University of Science and Technology
[email protected]
http://www.cse.ust.hk/~leichen
Topics
• Review of Basics
• Practical Data Mining
– Imbalanced Data
– Text and Web Mining
– Big Data
– Social Recommendation
– Social Media and Social Networks
• Hands on: 2 Major Projects
• Student Presentations
2016/3/26
Course Introduction
Outcome and Objective
• Students will know the current state of the art in data mining
• Students will be able to implement a practical data mining project
• Students will be able to present their ideas well
• Students will be prepared for PG study, internships, etc.
Projects: based on KDDCUPs
• Project 1:
– Predicting whether a funding request deserves an A+ (KDDCUP 2014)
• April 5th, 2016
• Project 2:
– Predicting dropouts in MOOC (KDDCUP 2015)
• May 10th, 2016
KDDCUP Examples
• KDDCUP tasks from past years
– 2007: Predict if a user is going to rate a movie; predict how many users are going to rate a movie
– 2006: Predict if a patient has cancer from medical images
– 2005: Given a web query (“Apple”), predict the categories (IT, Food)
– 1998: Given a person, predict if this person is going to donate money
• In general, we wish to
– Input: Data
– Output: Build model; apply model to future data
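The general pattern above (take data as input, build a model, apply it to future data) can be sketched in a few lines. This is only an illustrative sketch, not the method used in any KDDCUP or in the course projects: the nearest-centroid classifier, the function names `build_model` and `apply_model`, and the toy donor data are all assumptions made for the example.

```python
from collections import defaultdict


def build_model(rows, labels):
    """Build a nearest-centroid model: one mean feature vector per label."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(rows, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def apply_model(model, row):
    """Predict the label whose centroid is closest to the given row."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(model, key=lambda y: sq_dist(model[y], row))


# Hypothetical past data: [number of past gifts, years since last gift]
past_rows = [[5, 1], [4, 2], [0, 9], [1, 8]]
past_labels = ["donate", "donate", "no", "no"]

# Step 1: build the model from the data we already have.
model = build_model(past_rows, past_labels)

# Step 2: apply the model to future (unseen) data.
future_rows = [[3, 2], [0, 7]]
predictions = [apply_model(model, r) for r in future_rows]
print(predictions)
```

Real competition entries use far richer features and models; the point here is only the separation between the build step and the apply step.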
Important Sites
• Course Web Site
– http://www.cse.ust.hk/~leichen/comp4332
– TA: Yue Wang and Konstantinos Giannakopoulos
• Assignment Hand-in: CASS
Prerequisites
• Statistics and Probability would help
– But will be reviewed in class
• Machine Learning/Pattern Recognition would help
– We will review some of the most important algorithms
• One programming language
– We will teach new languages in the tutorial
Grading
• Midterm Exam: 20%
• Course Projects: 60%
• Presentations: 10%
• Term Paper: 10%
More info
• Textbooks:
– Listed on Course Website
– Buy them online if you wish