MIS 451 Building Business Intelligence Systems
Classification (2)
Data
Classification
Divide and Conquer
Pick the attribute that yields the greatest entropy
reduction and divide the data set by it
Stop when there is no attribute left to pick or the data in
every leaf node is pure (i.e., belongs to one class)
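The divide-and-conquer procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function names, the nested-dict tree format, the majority vote used when attributes run out, and the four-row data set are all assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def split_entropy(rows, attr, target):
    """Weighted average entropy of the subsets produced by dividing on attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

def build_tree(rows, attrs, target):
    labels = [row[target] for row in rows]
    # Stop: the node is pure, or no attribute is left to pick.
    if len(set(labels)) == 1 or not attrs:
        # Assumption: an impure leaf is labeled by majority vote.
        return Counter(labels).most_common(1)[0][0]
    # The lowest entropy after division is the greatest entropy reduction.
    best = min(attrs, key=lambda a: split_entropy(rows, a, target))
    remaining = [a for a in attrs if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = build_tree(subset, remaining, target)
    return tree

# Hypothetical four-row data set, for illustration only.
toy = [
    {"student": "yes", "credit": "fair", "class": "yes"},
    {"student": "yes", "credit": "excellent", "class": "yes"},
    {"student": "no", "credit": "fair", "class": "no"},
    {"student": "no", "credit": "excellent", "class": "no"},
]
tree = build_tree(toy, ["student", "credit"], "class")
print(tree)
```

In this toy data, dividing by student leaves both subsets pure, so the recursion stops after a single split.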
Classification
Step 1: There are four attributes to pick from:
student, income, age, and credit rating
E(BD) = 0.940
E(D|student) = 0.789
E(D|age) = 0.694
E(D|income) = 0.911
E(D|credit) = 0.892
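The Step 1 figures can be checked with a short script. The data table itself did not survive in this transcript, so the 14 rows below are an assumption: they are the widely used textbook example that produces these entropy values. The helper names are also hypothetical.

```python
import math
from collections import Counter

# ASSUMED data set (not present in this transcript).
# Columns: age, income, student, credit rating, class label.
ROWS = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31:40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31:40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31:40", "medium", "no", "excellent", "yes"),
    ("31:40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]
COLUMNS = {"age": 0, "income": 1, "student": 2, "credit": 3}
CLASS = 4

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def conditional_entropy(rows, col):
    """Weighted average entropy of the class after dividing by column col."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[CLASS])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

e_before = entropy([row[CLASS] for row in ROWS])
e_after = {name: conditional_entropy(ROWS, col) for name, col in COLUMNS.items()}
best = min(e_after, key=e_after.get)

print(f"entropy before division = {e_before:.3f}")
for name, value in e_after.items():
    print(f"E(D|{name}) = {value:.3f}")
print("attribute to pick:", best)
```

Under this assumed data, the computed values agree with the figures above to within rounding in the third decimal place, and age has the lowest entropy after division.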
Classification
Step 2: age has the lowest entropy after division, so divide the
original data set by age into subset 1 (<=30), subset 2 (31:40),
and subset 3 (>40)
Step 3-1: For subset 1, there are three attributes to
pick from: income, student, and credit
E(BD) = 0.971
E(D|student) = 0
E(D|income) = ??
E(D|credit) = ??
Classification
Step 3-2: Divide subset 1 by student into subset 1-1
(yes) and subset 1-2 (no)
Step 4-1: For subset 3, there are three attributes to
pick from: income, student, and credit
E(BD) = 0.971
E(D|credit) = 0
E(D|income) = ??
E(D|student) = ??
Step 4-2: Divide subset 3 by credit into subset 3-1
(fair) and subset 3-2 (excellent)
Extract rules from the model
Each path from the root to a leaf node forms
an IF-THEN rule.
In this rule, the attribute tests at the root and internal
nodes along the path are joined with AND to form the IF part.
The leaf node gives the THEN part of the rule.
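Rule extraction can be sketched as a walk over every root-to-leaf path. The example below is illustrative only: the nested-dict tree format and the function name are assumptions, the tree shape follows Steps 2 through 4-2, and the class labels at the leaves are assumed because the transcript does not state them.

```python
def extract_rules(tree, conditions=()):
    """Return one IF-THEN rule per root-to-leaf path of a nested-dict tree."""
    if not isinstance(tree, dict):
        # Leaf node: its class label becomes the THEN part.
        if_part = " AND ".join(f"{attr} is {value}" for attr, value in conditions)
        return [f"IF {if_part or 'TRUE'} THEN class is {tree}"]
    # Internal node: a one-key dict mapping the attribute to its branches.
    attr, branches = next(iter(tree.items()))
    rules = []
    for value, child in branches.items():
        rules.extend(extract_rules(child, conditions + ((attr, value),)))
    return rules

# Tree shape from the steps above; leaf labels are ASSUMED.
tree = {
    "age": {
        "<=30": {"student": {"yes": "yes", "no": "no"}},
        "31:40": "yes",
        ">40": {"credit": {"fair": "yes", "excellent": "no"}},
    }
}

rules = extract_rules(tree)
for rule in rules:
    print(rule)
```

The tree has five leaves, so the walk produces five rules, one per path.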
Reading: Data Mining book, pp. 279-291