Machine Learning
Usman Roshan
Dept. of Computer Science
NJIT
What is Machine Learning?
• “Machine learning is programming computers
to optimize a performance criterion using
example data or past experience.” (Introduction
to Machine Learning, Alpaydin, 2010)
• Examples:
– Facial recognition
– Digit recognition
– Molecular classification
A little history
• 1946: ENIAC, the first electronic computer,
performs numerical computations
• 1950: Alan Turing proposes the Turing test. Can machines
think?
• 1952: Arthur Samuel at IBM writes the first
game-playing program, for checkers. Knowledge-based
systems such as ELIZA and MYCIN follow later.
• 1957: Perceptron developed by Frank Rosenblatt.
Perceptrons can be combined to form a neural network.
• Early 1990s: Statistical learning theory emphasizes
learning from data instead of rule-based inference.
• Current status: used widely in industry; a combination
of various approaches, with data-driven methods prevalent.
Example up-close
• Problem: Recognize images representing digits
0 through 9
• Input: High dimensional vectors representing
images
• Output: 0 through 9 indicating the digit the
image represents
• Learning: Build a model from “training data”
• Predict “test data” with the model, as sketched below
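
A minimal sketch of this train/predict workflow in Python, assuming scikit-learn and its bundled 8x8 digit images (an illustrative choice, not a course requirement):

# Digit recognition: learn from training data, predict test data.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()  # each 8x8 image is flattened to a 64-dimensional vector
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

model = KNeighborsClassifier(n_neighbors=3)  # build a model from "training data"
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))  # predict "test data"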
Data model
• We assume that the data is represented by a set of
vectors each of fixed dimensionality.
• Vector: an ordered list of numbers
• We may refer to each vector as a datapoint and each
dimension as a feature
• Example:
– A bank wishes to classify humans as risky or safe for loan
– Each human is a datapoint and represented by a vector
– Features may be age, income, mortgage/rent, education,
family, current loans, and so on (see the sketch below)
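
A minimal sketch of this data model in Python with NumPy; the feature names and values below are hypothetical:

import numpy as np

# Hypothetical features for the loan example
features = ["age", "income", "mortgage_or_rent", "education_years", "current_loans"]

# One datapoint (one applicant) is one fixed-dimensional vector
applicant = np.array([35, 62000, 1200, 16, 2], dtype=float)

# A dataset is a matrix: one datapoint per row, one feature per column
X = np.array([[35, 62000, 1200, 16, 2],
              [52, 48000,  900, 12, 4]], dtype=float)
print(X.shape)  # (2 datapoints, 5 features)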
Machine learning datasets
• NIPS 2003 feature selection contest
• mldata.org
• UCI machine learning repository
Machine Learning techniques we will learn in the course
• Bayesian classification
– Univariate and multivariate
• Linear regression
• Maximum likelihood estimation
• Naïve-Bayes
• Feature selection
• Dimensionality reduction
– PCA
• Clustering
• Nearest neighbor
• Decision trees and random forests
• Linear discrimination
• Logistic regression
• Support vector machines
• Kernel methods
• Regularized risk minimization
• Hidden Markov models
• Graphical models
• Perceptron and neural networks
In practice
• Combination of various methods
• Parameter tuning
– Trade-off between error and model complexity
• Data pre-processing
– Normalization
– Standardization
• Feature selection
– Discarding noisy features (see the sketch below)
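
A minimal pre-processing sketch in Python with NumPy, assuming z-score standardization and a simple variance threshold for discarding near-constant features:

import numpy as np

X = np.array([[1.0, 200.0, 0.01],
              [2.0, 180.0, 0.01],
              [3.0, 220.0, 0.02]])

# Standardization: rescale each feature to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Feature selection: discard near-constant (uninformative) features
keep = X.var(axis=0) > 1e-3  # the threshold here is an illustrative choice
X_selected = X[:, keep]
print(X_std.round(2))
print(X_selected)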
Background
• Basic linear algebra and probability
– Vectors
– Dot products
– Eigenvectors and eigenvalues
• See Appendix of textbook for probability
background
– Mean
– Variance
– Gaussian/Normal distribution (a short refresher sketch follows)
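
A short refresher on these background concepts in Python with NumPy:

import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(np.dot(u, v))  # dot product: 1*4 + 2*5 + 3*6 = 32.0

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
vals, vecs = np.linalg.eig(A)  # eigenvalues and eigenvectors of A
print(vals)  # [2. 3.]

x = np.random.normal(loc=0.0, scale=1.0, size=1000)  # samples from a Gaussian
print(x.mean(), x.var())  # sample mean and variance, close to 0 and 1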
Assignments
• Implementation of basic classification
algorithms with Perl and Python
– Nearest Means (sketched below)
– Naïve Bayes
– K nearest neighbor
– Cross validation scripts
• Experiment with various algorithms on
assigned datasets
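
A minimal sketch of the nearest-means classifier from the assignment list, in Python with NumPy; this is one plausible implementation, not the required assignment solution:

import numpy as np

def nearest_means_fit(X, y):
    # Compute one mean vector per class label
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def nearest_means_predict(means, x):
    # Assign x to the class whose mean is closest in Euclidean distance
    return min(means, key=lambda c: np.linalg.norm(x - means[c]))

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.8]])
y = np.array([0, 0, 1, 1])
means = nearest_means_fit(X, y)
print(nearest_means_predict(means, np.array([4.9, 5.1])))  # -> 1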
Project
• Experiment with NIPS 2003 feature selection
datasets
– Goal: achieve the highest possible prediction accuracy
with the scripts we will develop throughout the course
• Predict labels of given datasets with two
different classifiers
Exams
• One midterm exam
• Final exam
• What to expect on the exams:
– Basic conceptual understanding of machine
learning techniques
– Be able to apply techniques to simple datasets
– Basic runtime and memory requirements
– Simple modifications
Grade breakdown
• Assignments and project worth 50%
• Exams worth 50%