Multidimensional Scaling (MDS)

Download Report

Transcript Multidimensional Scaling (MDS)

國立雲林科技大學
National Yunlin University of Science and Technology
On multidimensional scaling and the
embedding of self-organizing maps
Presenter : Shu-Ya Li
Authors : Hujun Yin
NN, 2008
Intelligent Database Systems Lab
Outline

Motivation

Objective

Methodology Review





N.Y.U.S.T.
I. M.
SOM, ViSOM & MDS
On multidimensional scaling of SOM
PCA & Principal curve
Nonlinear principal manifold and SOM
On the embedding of SOM

Experiments and Results

Conclusion

Personal Comments
2
Intelligent Database Systems Lab
Motivation

SOM and ViSOM, have been known to yield similar
results to multidimensional scaling (MDS).


N.Y.U.S.T.
I. M.
However, the exact connection has not been established.
The SOM-based methods not only produce topological or
metric scaling but also provide a principal manifold.
3
Intelligent Database Systems Lab
Objectives
N.Y.U.S.T.
I. M.

This paper reveals the connection between the SOM (or its
variant ViSOM) and multidimensional scaling(MDS)
through analyzing their cost functions.

Their relationship with the principal manifold is also
discussed.
4
Intelligent Database Systems Lab
Basic SOM algorithm
N.Y.U.S.T.
I. M.
1. Initialize SOM
2. For each input data
2.1 Identify its Best Matching Unit (BMU)
2.2 Update BMU and its neighborhood
3. Repeat Step 2 till centroids don’t change much or a threshold is exceeded
4. Assign each data to its BMU and return the BMS and clusters
BMU
mi=[7, 5, 8]
Final projection
x1=[8, 5, 9]
x2=[7, 4, 2]
…
x1=[8, 5, 9]
training
Data
+
○
×
5
Intelligent Database Systems Lab
Visualization induced SOM (ViSOM)

ViSOM

To preserve distance/metric (locally) on the map

To extrapolate smoothly
preserve distance on the map
存在比例關係
Intelligent Database Systems Lab
N.Y.U.S.T.
I. M.
Multidimensional Scaling (MDS)
1. Assign points to arbitrary
coordinates in p-dimensional space.
N.Y.U.S.T.
I. M.
2. Compute euclidean distances among all
pairs of points, to form the Dhat matrix.
座標上點跟點
之間的距離
4. Adjust coordinates of each point
3. Compare the Dhat matrix with the input
D matrix by evaluating the stress function.
•MDS
1.Classical MDS
2.Metric MDS
Compare
投入原始距離(或相似)矩陣
3.Nonmetric MDS
投入距離(或相似)資料之順序等級
Dhat matrix
input Data matrix
Intelligent Database Systems Lab
On multidimensional scaling of SOM

SOM is a nonmetric MDS or ordinal scaling.

N.Y.U.S.T.
I. M.
真實資料距離
x1=[8, 5, 9]
x2=[7, 4, 2]
…
nonmetric MDS condition
Data

ViSOM is a distance-preserving, metric MDS.

The cost function of metric MDS

The cost function of the SOM can be expressed as
投射到MDS Map的距離
x2=[7, 4, 2]
m1=[8, 5, 8]
m2=[7, 5, 2]
x1=[8, 5, 9]
投射到SOM Map
SOM
ViSOM
Intelligent Database Systems Lab
PCA

PCA
N.Y.U.S.T.
I. M.
Reduction by PCA
Y1 = a11x1 + a12x2
Y1
Y2 = a21x1 + a22x2
X2
X2
Y2
[X1 X2][Y1]
X1
Reduction by x-axis projection:
[X1 X2][X1]
[1, 1] [1]
[1, 0.5][1]

X1
[X1 X2]  [Y1 Y2]
[1, 1]  [1.414, 0]
[1, 0.5]  [1.2, -0.3]
X2
Y1
Principal Curve/Surface
X1
Intelligent Database Systems Lab
Principal Curve/Surface

N.Y.U.S.T.
I. M.
Principal Curve/Surface

Projection index
X2

Self-consistent principal curve

Kernel smoother
Y1
X1
10
Intelligent Database Systems Lab
Nonlinear principal manifold and SOM
N.Y.U.S.T.
I. M.

ViSOM is a discrete principal manifold, and it is also a MDS.

In the SOM, data are projected onto the nodes rather than onto the
curve/surface.

The smoothing process in the SOM and ViSOM, as a convergence criterion

The smoothing process in the ViSOM resembles that of the principal curve
as shown below

The MDS and principal manifold perform the same underlying task at least
in the context of data visualization and dimension reduction.
11
Intelligent Database Systems Lab
On the embedding of SOM

N.Y.U.S.T.
I. M.
growing ViSOM

Start with a small initial map, say M0 × M0. (5*5)

Update the weights of the neurons of the neighborhood using the ViSOM principle

Grow the map by adding a column or row to the side with the highest activities

Refresh the map (neurons) probabilistically

Check if the map has converged

Project the data samples onto the map, either to the neurons or by the LLP resolution enhancement
method
Intelligent Database Systems Lab
Experiments
N.Y.U.S.T.
I. M.
13
Intelligent Database Systems Lab
Conclusion

N.Y.U.S.T.
I. M.
This paper reveals the connection between the SOM,
ViSOM and MDS through analyzing their cost functions.

SOMs and MDS are similar mappings for the principle of data
visualization.

The ViSOM is closer to MDS than SOM.

SOM is a useful tool for data clustering, relational
visualisation (nonmetric scaling) and management.

ViSOM is particularly suited for direct visualisation and is
a metric preserving nonlinear manifold.

gViSOM is an effective algorithm.
14
Intelligent Database Systems Lab
Personal Comments

Advantage


…
Drawback


N.Y.U.S.T.
I. M.
如果圖多一點就好了
Application

…
15
Intelligent Database Systems Lab