Max-margin Clustering Algorithm
Download
Report
Transcript Max-margin Clustering Algorithm
Max-margin Clustering: Detecting Margins from Projections of Points on Lines
1
Gopalan ,
2
Sankaranarayanan
Raghuraman
and Jagan
1Center for Automation Research, University of Maryland, College Park, MD USA; 2NEC Labs, Cupertino, CA USA
Multi-cluster Problem
Max-margin Clustering Algorithm
Location information of
projected points (SI)
alone is insufficient to
detect margins
Draw lines between all pairs of points
Estimate the probability of presence of margins between a
pair of points xi and xj by computing f(xi,xj)
Perform global clustering using f between all point-pairs
Results
The Role of Distance of Projection
Proposition 2
For line intervals in margin region,
perpendicular to the separating
hyperplane
min Dmin min i
Problem Statement
Given an unlabelled set of points forming k clusters, find a
grouping with maximum separating margin among the clusters
Int *
Prior work: (Mostly) Establish feedback between different
label proposals, and run a supervised classifier on it
Goal: To understand the relation between data points and
margin regions by analyzing projections of data on lines
Two-cluster Problem
Assumptions
Linearly separable clusters
Kernel trick for non-linear case
No outliers in data (max margin
exist only between clusters)
Enforce global cluster balance
Proposition 1
SI* exists ONLY on line segments in
margin region that are perpendicular
to the separating hyperplane
Such line segments directly provide
cluster groupings
Defn: Dmin of a line interval is the
minimum distance of projection
of points in that interval.
No outlier assumption: Max
margin between points within a
cluster M m min i
i
i
Proposition 3
For line intervals inside a cluster of
length more than Mm
max
D
M
/
2
min
m
CL
Int
Proposition 4
An interval with SI having no projected
points with distance of projection less
than Dmin*, [ SI ]Dmin* min i can lie
i
only outside a cluster; where
Dmin* min i
i
A Pair-wise Similarity Measure for Clustering
f ( xi , x j ) exp( max D[SI ]D )
D:Intij
f(xi,xj)=1, iff xi=xj
f(xi,xj)<<1, iff xi and xj are from different clusters, and Intij is
perpendicular to their separating hyperplane
Summary
Clustering
Detecting margin regions
Obtaining statistics of location and distance of projection of
points that are specific to line segments in margin regions
(Prop. 1 to 4)
A pair-wise similarity measure to perform clustering, which
avoids some optimization-related challenges prevalent in
most existing methods
References
1. F. De la Torre, and T. Kanade, “Discriminative cluster analysis”, ICML, pp. 241-248, 2006. ([8] in table)
2. K. Zhang, I.W. Tsang, and J.T. Kwok, “Maximum margin clustering made practical”, IEEE Trans.
Neural Networks, 20(4), pp. 583-596, 2009. ([31] in table)