Data Mining in Forecasting
Chapter 9
DATA MINING
PAULA JENSEN
SDSM&T
ENGM 745
McGraw-Hill/Irwin
Copyright © 2009 by The McGraw-Hill Companies, Inc. All rights reserved.
DATA MINING
Extracting useful information from large databases
Tools of Data Mining
Looking at where to find the data
TOOLS OF DATA MINING
Prediction
Classification
Clustering
Association
PREDICTION
Predict the value of a numeric variable
Customer’s expenditure
Will they purchase?
What are their interests?
Do their interests predict a purchase?
CLASSIFICATION
Classes of objects or actions
Reliability of customer
Income
Location
CLUSTERING
Analysis tools group objects into classes
Where is the cutoff of income or size?
How do I group the information?
ASSOCIATION
Patterns based on likes
Netflix
Facebook
Google
CLASSIFICATION
k-nearest neighbor
Naïve Bayes
Classification/regression trees
Logistic Regression
DATA MINING TERMINOLOGY
K-NEAREST NEIGHBOR
Use a subset of the total data, called training data
Select the closest neighbors using the Euclidean distance shown on the previous slide; other metrics are also available to define neighbors
Validation data is a separate set of data
The test statistic on the validation data matters more than on the training data
A 60% training / 40% validation split is an acceptable mix
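The 60/40 split described above can be sketched as follows; this is a minimal illustration, and the 100-record dataset is just a stand-in:

```python
import random

def train_validation_split(records, train_frac=0.6, seed=42):
    """Randomly shuffle records and split them into training and validation sets."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

records = list(range(100))             # stand-in for 100 customer records
train, validation = train_validation_split(records)
print(len(train), len(validation))     # 60 40
```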
K-NEAREST NEIGHBOR ANALYSIS
Multidimensional
The program computes a distance associated with each attribute
Continuous variables are measured on different scales
Categorical attributes use a weighting mechanism
Example: will they respond to marketing for a loan?
k = 3 means 3 neighbors are used to classify each record
Type 1 would take a loan – Type 0 would not take a loan
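The k = 3 classification above can be sketched as a majority vote over the three nearest neighbors by Euclidean distance; the tiny training set below is invented for illustration, with type 1 = would take the loan:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify query by majority vote of its k nearest training points."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# (income in $1000s, household size) -> 1 = took the loan, 0 = did not
train = [((40, 2), 0), ((45, 3), 0), ((50, 1), 0),
         ((85, 3), 1), ((90, 4), 1), ((95, 2), 1)]
print(knn_classify(train, (88, 3)))  # 1
```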
TERMS
Lift – measures the change in concentration of a particular class when the model is used to select a group from the general population. The example shows significant lift.
Decile-wise chart – if we pick the top 10% of our records as classified by our model, our selection would include approximately 7 times as many correct classifications as a random selection.
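Lift in the top decile can be computed directly: rank records by model score, take the top 10%, and compare that group's positive rate with the overall rate. A minimal sketch on invented scores:

```python
def decile_lift(scores_and_labels, frac=0.10):
    """Lift of the top `frac` of records when ranked by model score."""
    ranked = sorted(scores_and_labels, key=lambda p: p[0], reverse=True)
    top = ranked[:max(1, int(len(ranked) * frac))]
    top_rate = sum(label for _, label in top) / len(top)      # rate in top decile
    base_rate = sum(label for _, label in ranked) / len(ranked)  # overall rate
    return top_rate / base_rate

# 20 scored records with a 10% overall positive rate (invented data)
data = [(0.95, 1), (0.90, 1)] + [(0.05 * i, 0) for i in range(18)]
print(decile_lift(data))  # 10.0
```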
Classification Trees
CLASSIFICATION TREES
Advantages
Decision rules are easy to generate
Easy to understand
Disadvantages
Tendency to overfit the data
Correlated attributes cause multicollinearity
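The "easy decision rules" advantage can be seen with a single-split tree (a stump): one threshold on one attribute yields a readable rule. A minimal split search on invented loan data:

```python
def best_stump(points):
    """Find the income threshold whose rule 'predict 1 if income >= t' is most accurate."""
    best = None
    for t, _ in points:
        correct = sum((xi >= t) == bool(label) for xi, label in points)
        if best is None or correct > best[1]:
            best = (t, correct)
    return best

# (income in $1000s, label) -- invented training records
data = [(40, 0), (45, 0), (50, 0), (85, 1), (90, 1), (95, 1)]
threshold, correct = best_stump(data)
print(threshold, correct)  # 85 6 -> rule: "take loan if income >= 85" fits all 6
```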
NAÏVE BAYES
Statistical Classification
Bayes’ theorem: predicts the probability of a prior event, given that a certain subsequent event has taken place
Called naïve because each attribute is assumed to be independent
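The independence assumption lets each attribute contribute its own likelihood factor. A minimal naïve Bayes classifier over categorical attributes; the training records are invented for illustration (no smoothing, for clarity):

```python
from collections import Counter

def naive_bayes_predict(train, query):
    """Score each class by P(class) * product of P(attribute value | class)."""
    classes = Counter(label for _, label in train)
    scores = {}
    for c, n_c in classes.items():
        score = n_c / len(train)                       # prior P(c)
        for i, value in enumerate(query):
            matches = sum(1 for attrs, label in train
                          if label == c and attrs[i] == value)
            score *= matches / n_c                     # likelihood P(value | c)
        scores[c] = score
    return max(scores, key=scores.get)

# (income level, location) -> reliable customer? (invented records)
train = [(("high", "urban"), 1), (("high", "rural"), 1), (("high", "urban"), 1),
         (("low", "urban"), 0), (("low", "rural"), 0), (("low", "urban"), 0)]
print(naive_bayes_predict(train, ("high", "urban")))  # 1
```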
BAYESIAN THEOREM
P(A|B) = P(B|A) × P(A) / P(B)
P(A) is the prior probability of A
P(A|B) is the conditional probability of A, given B
P(B|A) is the conditional probability of B, given A
P(B) is the prior probability of B
APPLYING BAYES’ THEOREM
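A worked numeric application of the formula; the probabilities below are invented purely for illustration:

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Invented example: A = customer defaults, B = customer is late on a payment
p_a = 0.05            # prior probability of default, P(A)
p_b_given_a = 0.80    # P(late | default)
p_b = 0.20            # overall probability of being late, P(B)

p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)    # about 0.20: a late payment raises the default estimate 4x
```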
REGRESSION
Logistic regression or logit analysis
The difference between logistic regression and ordinary regression is that the dependent variable in logistic regression is categorical, not continuous
The dependent variable is dichotomous – either yes or no
The predicted value of the dependent variable is limited to values between 0 and 1
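The 0-to-1 bound comes from passing the linear predictor through the logistic (sigmoid) function. A sketch with invented coefficients relating income to purchase probability:

```python
import math

def logistic(z):
    """Sigmoid: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Invented fitted model: log-odds of "yes" as a function of income in $1000s
b0, b1 = -4.0, 0.05
for income in (20, 80, 140):
    p = logistic(b0 + b1 * income)
    print(income, round(p, 3))
# prints: 20 0.047 / 80 0.5 / 140 0.953 -- always between 0 and 1
```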
WHERE DO I FIND THE DATA???
Current Customer Activity
Collect in your database
Family names
Sales software
Forms from your website (e.g., via Wufoo.com)
Track inquiries
Current Facebook Activity
BUY IT!
Mailing lists
How to use it???