LSSG Black Belt Training
Data Mining
Cluster Analysis Output
K-Means Clustering
2-Cluster Solution
Final Cluster Centers
Cluster                                          1      2
Miles per Gallon                                28     16
Engine Displacement (cu. inches)               121    302
Horsepower                                      82    138
Vehicle Weight (lbs.)                         2368   3864
Time to Accelerate from 0 to 60 mph (sec)       16     15
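The output above gives only the final centers. As a rough sketch of how k-means arrives at such centers, the block below implements the basic assign-then-average loop (Lloyd's algorithm) in plain Python. The toy points, the starting centers, and the function name `kmeans` are assumptions made for illustration; they are not taken from the training data, and statistical packages choose starting centers in their own way.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Basic k-means: assign each point to its nearest center, then move
    each center to the mean of its members, until assignments stop changing."""
    centers = [tuple(c) for c in centers]  # work on a copy
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels

# Hypothetical cars as (miles per gallon, weight in thousands of lbs).
points = [(30, 2.1), (28, 2.3), (32, 2.0), (15, 4.0), (16, 3.8), (14, 4.2)]
initial = [(30, 2.1), (15, 4.0)]  # assumed starting centers

final_centers, labels = kmeans(points, initial)
print(final_centers)
print(labels)
```

Each final center is simply the mean of the cases assigned to it, which is how the rows of the Final Cluster Centers table should be read.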
Number of Cases in each Cluster
Cluster 1     235
Cluster 2     157
Valid         392
Missing        14

Are the assumptions of Cluster Analysis satisfied?
Interpret the two clusters (even if assumptions violated).
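One thing worth checking when thinking about the assumptions is the scale of the variables. The centers are reported in raw units, so if the variables were clustered unstandardized (an assumption, since the slides do not say), Euclidean distance would be driven by whichever variable has the largest numbers. The sketch below uses only the two centers from the table and shows how much of the squared distance between them comes from each variable. The short variable names are chosen here for illustration.

```python
# Final cluster centers from the 2-cluster solution.
centers = {
    1: {"mpg": 28, "displacement": 121, "horsepower": 82, "weight": 2368, "accel": 16},
    2: {"mpg": 16, "displacement": 302, "horsepower": 138, "weight": 3864, "accel": 15},
}

# Squared difference between the two centers, variable by variable.
sq_diff = {var: (centers[1][var] - centers[2][var]) ** 2 for var in centers[1]}
total = sum(sq_diff.values())

# Share of the squared Euclidean distance contributed by each variable.
share = {var: d / total for var, d in sq_diff.items()}
for var, s in sorted(share.items(), key=lambda kv: -kv[1]):
    print(f"{var:<13}{s:7.2%}")
```

Vehicle weight accounts for nearly all of the separation between the two centers, which is the usual argument for standardizing variables (for example, to z-scores) before clustering.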
3-Cluster Solution
Final Cluster Centers
Cluster                                          1      2      3
Miles per Gallon                                15     30     22
Engine Displacement (cu. inches)               339    103    201
Horsepower                                     155     76    102
Vehicle Weight (lbs.)                         4189   2172   3074
Time to Accelerate from 0 to 60 mph (sec)       14     16     16
Number of Cases in each Cluster
Cluster 1      96
Cluster 2     165
Cluster 3     131
Valid         392
Missing        14

Interpret the 3-cluster solution.
Is it better than the 2-cluster solution?
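To make the interpretation concrete, it can help to see how a single car would be placed among the three centers. The sketch below assigns a car to the nearest center by Euclidean distance on the raw variables. The car `new_car` is hypothetical and not part of the data set, and the use of raw (unstandardized) distance is an assumption.

```python
import math

# Final cluster centers from the 3-cluster solution, in the order
# (mpg, displacement, horsepower, weight, seconds 0 to 60 mph).
centers = {
    1: (15, 339, 155, 4189, 14),
    2: (30, 103, 76, 2172, 16),
    3: (22, 201, 102, 3074, 16),
}

def nearest_cluster(car):
    """Return the number of the cluster whose center is closest to the car."""
    return min(centers, key=lambda k: math.dist(car, centers[k]))

# Hypothetical car, invented for illustration.
new_car = (24, 150, 95, 2800, 15)
print(nearest_cluster(new_car))
```

Read this way, the three clusters look like heavy, powerful cars with low mileage (cluster 1), light, economical cars (cluster 2), and a mid-range group (cluster 3).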
4-Cluster Solution
Final Cluster Centers
Cluster                                          1      2      3      4
Miles per Gallon                                 9     30     20     15
Engine Displacement (cu. inches)                 4    108    221    346
Horsepower                                      93     78    107    159
Vehicle Weight (lbs.)                          732   2241   3208   4255
Time to Accelerate from 0 to 60 mph (sec)        9     16     16     13
Number of Cases in each Cluster
Cluster 1       1
Cluster 2     187
Cluster 3     119
Cluster 4      85
Valid         392
Missing        14

How does the 4-cluster solution compare to the previous two?
Look at the number of cases per cluster. What problem with the data does that indicate?
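The cases-per-cluster question can be checked mechanically. The sketch below flags any cluster that holds a very small share of the valid cases. The 1% cutoff `MIN_SHARE` is a hypothetical rule of thumb chosen for illustration, not a standard threshold.

```python
# Cluster sizes from the 4-cluster solution.
sizes = {1: 1, 2: 187, 3: 119, 4: 85}
n_valid = sum(sizes.values())

# Hypothetical rule of thumb: flag clusters holding under 1% of valid cases.
MIN_SHARE = 0.01
tiny = [k for k, n in sizes.items() if n / n_valid < MIN_SHARE]
print(tiny)
```

A cluster made up of a single case, with a center such as a 4 cu. inch engine and a 732 lb. vehicle, is more likely an outlier or a data-entry problem than a real market segment.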
Revised 4-Cluster Solution
Final Cluster Centers
Cluster                                          1      2      3      4
Miles per Gallon                                24     31     18     14
Engine Displacement (cu. inches)               161     99    266    359
Horsepower                                      96     73    118    166
Vehicle Weight (lbs.)                         2786   2115   3539   4387
Time to Accelerate from 0 to 60 mph (sec)       16     17     16     13
Number of Cases in each Cluster
Cluster 1     103
Cluster 2     139
Cluster 3      85
Cluster 4      64
Valid         391
Missing        14

Interpret the new 4-cluster solution with outlier removed.
Which of the 3 solutions (2, 3, or 4 clusters) do you consider the most meaningful?
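When comparing the solutions, a quick side-by-side of the cluster sizes is a useful starting point. The sketch below takes the counts from the four tables, confirms that each set of cluster sizes adds up to the reported number of valid cases, and prints the share held by the smallest cluster. The dictionary layout is an illustrative choice.

```python
# Cluster sizes and reported valid N for each solution on the slides.
solutions = {
    "2-cluster": ([235, 157], 392),
    "3-cluster": ([96, 165, 131], 392),
    "4-cluster": ([1, 187, 119, 85], 392),
    "revised 4-cluster": ([103, 139, 85, 64], 391),
}

for name, (sizes, valid) in solutions.items():
    smallest = min(sizes) / valid
    print(f"{name:<18} sum={sum(sizes)} valid={valid} smallest share={smallest:.1%}")
```

Balanced sizes alone do not make a solution meaningful; the centers still have to describe groups that are distinct and interpretable.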