nonmetric MDS


Nonmetric Multidimensional Scaling
• input data are ranks: most similar pair AB < CB < AC most different pair; or input data are rating scales (very similar 1 . . . 9 very different), but we don't believe in assuming the data have interval-level properties.
• This is actually astounding!! We begin with rank/ordinal-level data, and the result of the model is a set of distances or coordinates (both ratio-level).
Nonmetric Multidimensional Scaling
Some basics
• We will be attempting to estimate coordinates Xik's whose model-fitted distances match our dissimilarity data as closely as possible.
• for metric MDS: dij = linear function of δij
• for nonmetric MDS: dij = monotonic function of δij
a. monotonically increasing function (always increases or stays the same, never goes down):
actually -- 2 definitions of monotonicity
i. strong monotonicity:
whenever δij < δkl then d^ij < d^kl
ii. weak monotonicity:
whenever δij < δkl then d^ij ≤ d^kl
We'll go with ii, weak monotonicity (less restrictive, less demanding assumptions on data that could be errorful).
Nonmetric Multidimensional Scaling
b. input dissimilarities data {δij's} are immediately translated to ranks (thus the input data can certainly be ranks).
example: the dissimilarities .20, .53, .41 are ranked 1, 3, 2.
c. How to handle "ties" in the data (if δij = δkl, what ranks do they get?)
example: the dissimilarities .20, .41, .41 -- the .20 gets rank 1, but do the tied .41's get rank 2 or rank 3?
i. primary approach to ties:
if δij = δkl then d^ij may or may not equal d^kl
ii. secondary approach to ties:
if δij = δkl then d^ij = d^kl
We'll go with i (the primary approach) because again it's more flexible and less restrictive for data we know are likely errorful. (A short sketch of the ranking and tie handling appears below.)
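As a minimal sketch of these two steps, assuming SciPy is available, here is how scipy.stats.rankdata handles the two small examples above (the variable names are purely illustrative):

```python
from scipy.stats import rankdata

# Item b: translating dissimilarities to ranks
deltas = [0.20, 0.53, 0.41]
print(rankdata(deltas))                         # [1. 3. 2.]

# Item c: a tie -- do the two .41's get rank 2 or rank 3?
deltas_tied = [0.20, 0.41, 0.41]
print(rankdata(deltas_tied))                    # [1.  2.5 2.5]  (default: average of ranks 2 and 3)
print(rankdata(deltas_tied, method='ordinal'))  # [1 2 3]        (breaks the tie arbitrarily)

# Under the primary approach to ties, the tied pair places no constraint on the
# corresponding disparities d^ij: they may or may not come out equal.
```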
Nonmetric Multidimensional Scaling
3. Monotonic (or "isotonic") regression
3 values to watch for each pair of points (i & j):
* data point = dissimilarity = δij, immediately translated to ranks
* distances = dij, computed at each iteration from the model (estimate coordinates, compute dij's, etc.)
* disparities = d^ij (in K & W notation), the values needed in the monotonic regression. These are close to the dij values, but the d^ij's are not distances (i.e., they will not necessarily satisfy axioms like the triangle inequality).
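The standard machinery for producing the d^ij's is least-squares monotonic regression via the pool-adjacent-violators algorithm (PAVA). Below is a minimal Python sketch of that pooling idea, not the lecture's own code:

```python
def pava(values):
    """Least-squares monotonic (isotonic) regression via pool-adjacent-violators.

    `values` are the distances dij listed in increasing order of the rank of δij;
    the return value is the corresponding non-decreasing sequence of disparities d^ij.
    """
    # Each block holds [sum, count]; adjacent blocks are pooled whenever a
    # violation (a decrease in the running means) appears.
    blocks = []
    for v in values:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-1][0] / blocks[-1][1] < blocks[-2][0] / blocks[-2][1]:
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    # Expand each pooled block back into one fitted value per input point
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


print(pava([1, 3, 2, 4]))   # [1.0, 2.5, 2.5, 4.0]  (the 3 and 2 are pooled to their mean)
```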
Nonmetric Multidimensional Scaling
4. An example of what's done in monotonic regression
a. Begin with the data δij's:

        1     2     3     4     5
  1     -
  2    .34    -
  3    .47   .39    -
  4    .92   .45   .50    -
  5    .25   .56   .72   .23    -

b. Translate to ranks:

        1     2     3     4     5
  1     -
  2     3     -
  3     6     4     -
  4    10     5     7     -
  5     2     8     9     1     -
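For what it's worth, the rank matrix in (b) can be reproduced with standard SciPy calls; this is just one convenient route, and the variable names are illustrative:

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.stats import rankdata

# The example dissimilarity matrix from (a), stored as a full symmetric matrix
delta = np.array([
    [0.00, 0.34, 0.47, 0.92, 0.25],
    [0.34, 0.00, 0.39, 0.45, 0.56],
    [0.47, 0.39, 0.00, 0.50, 0.72],
    [0.92, 0.45, 0.50, 0.00, 0.23],
    [0.25, 0.56, 0.72, 0.23, 0.00],
])

# Condense to the 10 unique off-diagonal values, rank them, and unpack again
ranks = squareform(rankdata(squareform(delta)))
print(ranks)   # off-diagonal entries reproduce the rank matrix in (b), e.g. rank 1 for pair (5,4)
```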
c. We'll have dij's at each iteration (these are computed using the formula for Euclidean distance, based on the estimated coordinates Xik's at that iteration; a small sketch follows).
(The Xik's themselves are estimated via the algorithm of "steepest descent".)
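For instance, given a current set of estimated coordinates Xik, the dij's at that iteration are ordinary Euclidean distances. A small sketch, with made-up 2-dimensional coordinates purely for illustration:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical current-iteration coordinates Xik for the 5 stimuli in 2 dimensions
X = np.array([
    [0.1, 0.9],
    [0.4, 0.6],
    [0.8, 0.7],
    [0.9, 0.1],
    [0.2, 0.2],
])

# dij at this iteration: plain Euclidean distances between the estimated points
d = squareform(pdist(X, metric='euclidean'))
print(np.round(d, 2))   # d[i, j] is the distance between stimuli i+1 and j+1
```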
Nonmetric Multidimensional Scaling
d. Estimate disparities d^ij's
stimulus pair (i,j),
ordered from min to max δij     δij     rank of δij     dij estimated at "current" iteration
  5,4                           .23         1                3
  5,1                           .25         2                6
  2,1                           .34         3                3*
  3,2                           .39         4                5
  4,2                           .45         5                8
  3,1                           .47         6               10
  4,3                           .50         7               13
  5,2                           .56         8               11*
  5,3                           .72         9                9*
  4,1                           .92        10               15
These are distances (so they satisfy our distance
axioms), but we want a model to obtain values
that are monotonically increasing for our
dissimilarities. Note there are deviations (i.e.,
decreases) at points denoted by "*".
(Plot: dij at the current iteration against the rank of δij.)
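The starred rows can be found mechanically by walking down the table in rank order and flagging every place where the current-iteration dij drops below the one before it. A small check using the values from the table above:

```python
# dij at the current iteration, listed in increasing order of the rank of delta_ij,
# together with the stimulus pairs from the table above
d = [3, 6, 3, 5, 8, 10, 13, 11, 9, 15]
pairs = ['5,4', '5,1', '2,1', '3,2', '4,2', '3,1', '4,3', '5,2', '5,3', '4,1']

# Flag every pair whose dij is smaller than the preceding dij (a monotonicity violation)
violations = [pairs[k] for k in range(1, len(d)) if d[k] < d[k - 1]]
print(violations)   # ['2,1', '5,2', '5,3'] -- exactly the rows marked with * above
```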
Nonmetric Multidimensional Scaling
The dij's do not form a monotonically increasing function of the rank of δij, but the disparities d^ij's do:
rank of δij:    1    2    3    4    5    6    7    8    9   10
dij:            3    6    3    5    8   10   13   11    9   15
d^ij:           3  4.5  4.5    5    8   10   11   11   11   15

(Shepard diagram: dij and d^ij plotted against the rank of δij.)
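For reference, the d^ij row above can also be reproduced with an off-the-shelf isotonic-regression routine; here is one way using scikit-learn's IsotonicRegression (the hand-rolled PAVA sketch earlier gives the same answer):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

ranks = np.arange(1, 11)                           # rank of delta_ij
d = np.array([3, 6, 3, 5, 8, 10, 13, 11, 9, 15])   # dij at the current iteration

# Fit the best non-decreasing (least-squares) approximation to d as a function of rank
d_hat = IsotonicRegression().fit_transform(ranks, d)
print(d_hat)   # 3, 4.5, 4.5, 5, 8, 10, 11, 11, 11, 15 -- the d^ij row above
```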
Model Assessment:
Kruskal’s “Stress”
badness-of-fit:

Stress = sqrt[ Σ over all pairs (dij - d^ij)²  /  Σ over all pairs dij² ]
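As an illustrative check (not part of the lecture), plugging the example's dij and d^ij values into this formula:

```python
import numpy as np

d     = np.array([3, 6, 3, 5, 8, 10, 13, 11, 9, 15])        # distances dij
d_hat = np.array([3, 4.5, 4.5, 5, 8, 10, 11, 11, 11, 15])   # disparities d^ij

# Kruskal's Stress-1: square root of (sum of squared deviations) / (sum of squared distances)
stress = np.sqrt(np.sum((d - d_hat) ** 2) / np.sum(d ** 2))
print(round(stress, 3))   # about 0.122
```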