Classification
Classification as a data
mining tool
Done by
William Hellela
Rauf Gadar
Alex Prewett
Definition
Classification is the process of
dividing a dataset into mutually
exclusive groups so that the
members of each group are as
"close" as possible to one another,
and different groups are as "far" as
possible from one another, where
distance is measured with respect to
the specific variable(s) we are trying to
predict.
Business Examples
A typical classification problem is to
divide a database of companies into
groups that are as homogeneous as
possible with respect to a
creditworthiness variable with values
"Good" and "Bad".
Predicting the response to a direct
marketing campaign [will / will not
respond].
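The creditworthiness example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the company data, the single "debt ratio" input, and the function names are all hypothetical, and a one-threshold split is just one simple way to form two groups that are as homogeneous as possible with respect to the "Good"/"Bad" variable.

```python
# Hypothetical data: (debt_ratio, creditworthiness) for a handful of companies.
companies = [
    (0.10, "Good"), (0.25, "Good"), (0.30, "Good"), (0.45, "Good"),
    (0.50, "Bad"), (0.65, "Bad"), (0.70, "Good"), (0.90, "Bad"),
]

def misclassified(threshold, rows):
    """Count rows whose label disagrees with the rule: ratio < threshold -> Good."""
    errors = 0
    for ratio, label in rows:
        predicted = "Good" if ratio < threshold else "Bad"
        if predicted != label:
            errors += 1
    return errors

def best_threshold(rows):
    """Try the midpoint between neighbouring ratios and keep the purest split."""
    ratios = sorted(ratio for ratio, _ in rows)
    candidates = [(a + b) / 2 for a, b in zip(ratios, ratios[1:])]
    return min(candidates, key=lambda t: misclassified(t, rows))

threshold = best_threshold(companies)
errors = misclassified(threshold, companies)
print(f"split at debt ratio {threshold:.3f}, misclassified companies: {errors}")
```

The two resulting groups are mutually exclusive (every company falls on exactly one side of the threshold), and the threshold is chosen so that each side is dominated by a single class.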
Classification vs. Regression
What distinguishes classification from
regression is the type of output that
is predicted. Classification, as the
name implies, predicts class
membership.
For example, a classification model
predicts that a potential customer
will respond to an offer.
Classification vs. Regression
(cont.)
A regression model, by contrast, predicts a
specific numeric value. For example, a model
predicts that a certain customer's
profitability will be $854.
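The difference in output type can be made concrete with a small Python sketch. Everything here is an assumption for illustration: the customer history is invented, and the nearest-neighbour classifier and least-squares line are stand-ins for whatever models would be used in practice. The point is only that one function returns a class label and the other returns a number.

```python
# Hypothetical history: (past_purchases, responded_to_offer, profit_in_dollars).
history = [
    (1, False, 120.0),
    (2, False, 260.0),
    (4, True, 480.0),
    (6, True, 700.0),
    (8, True, 940.0),
]

def classify_response(purchases):
    """1-nearest-neighbour classifier: copy the class of the most similar customer."""
    nearest = min(history, key=lambda row: abs(row[0] - purchases))
    return nearest[1]  # output is a class label: True or False

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line([row[0] for row in history],
                            [row[2] for row in history])

def predict_profit(purchases):
    """Regression: output is a specific numeric value."""
    return slope * purchases + intercept

print("will respond:", classify_response(5))
print("expected profit:", round(predict_profit(5), 2))
```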
Predictive vs. Descriptive
Models
Predictive: forecast explicit values,
based on patterns determined from
known results.
Descriptive: describe the patterns in
existing data, and are generally used
to create meaningful subgroups.
Data Mining Models
Classification vs. Other Tools
Classification is used to answer
questions from a finite set of classes
Estimation is used to answer
questions from an unknown,
continuous set of answers
Clustering also does not require a
finite set of predefined classes
Prediction is the task of learning a
pattern from examples and applying the
developed model to new cases.
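A short Python sketch may help contrast classification with clustering. The spend figures, the class names, and the 500-dollar cutoff are hypothetical, and the tiny two-group k-means routine is just one simple clustering method chosen for illustration. Classification assigns each record to one of a finite set of classes fixed in advance; clustering is given no classes and forms the groups from the data alone.

```python
# Hypothetical annual spend (in dollars) for eight customers.
spend = [120, 150, 180, 200, 900, 950, 1000, 1100]

# Classification: the finite set of classes is defined before looking at the data.
CLASSES = ("low-value", "high-value")

def classify(value, cutoff=500):
    """Assign one of the predefined classes (the cutoff is an assumed rule)."""
    return CLASSES[1] if value >= cutoff else CLASSES[0]

# Clustering: no predefined classes; groups emerge from the data.
def two_means(values, iterations=10):
    """Tiny 1-D k-means with k=2, started from the smallest and largest value."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer of the two current centers.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

low, high = two_means(spend)
print("classified:", [classify(v) for v in spend])
print("clusters:", low, high)
```

Note that the clusters come back unnamed: deciding what each group means is left to the analyst, whereas the classifier can only ever answer with one of the classes it was given.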
Questions or comments
Any questions?