National Yunlin University of Science and Technology (國立雲林科技大學)
Class distributions on SOM surfaces
for feature extraction and
object retrieval
Advisor : Dr. Hsu
Graduate : Kuo-min Wang
Authors : Jorma T. Laaksonen*, J. Markus Koskela, Erkki Oja
2005 Expert Systems with Applications
Intelligent Database Systems Lab
N.Y.U.S.T.
I. M.
Outline
Motivation
Objective
Introduction
Class Distributions
BMU Probabilities
BMU Entropy
SOM Surface Convolutions
Multiple Feature Extractions
Bayesian Decision Estimation
Personal Opinions
Motivation


A Self-Organizing Map (SOM) is typically
trained in unsupervised mode, using a large
batch of training data.
Even from the same data, qualitatively
different distributions can be obtained by using
different feature extraction techniques
Objective


We use such distributions for comparing different
classes and different feature representations of the
data in our content-based image retrieval system
PicSOM.
The information-theoretic measures of entropy and
mutual information are suggested to evaluate the
compactness of a distribution and the independence
of two distributions.
Introduction

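Image retrieval
Consists of segmentation, feature extraction, representation, and query
processing
Segmentation
Partitions an image into distinct regions. In most cases this means finding
the edges of the objects in the image and then determining whether each
region is a meaningful one
Feature extraction
Refers to the features of a particular region of an image.
Feature extraction is directly tied to the feature representation,
because different representations require different extraction methods
Color, shape, texture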
Introduction (cont.)

We study how object class histograms on SOMs can
be given interpretations in terms of probability
densities and information-theoretic measures


Entropy and mutual information (Cover & Thomas, 1991)
A good feature
With a good feature, the class is heavily concentrated on only a few nearby
map elements, giving a low value of entropy.
The mutual information of two features' distributions is a
measure of how independent those features are (the lower the value, the
more independent the features).
Class Distributions

Normalized to unit sum, the hit frequencies give a discrete histogram, which is a
sample estimate of the probability distribution of the class on the SOM surface.
The shape of the distribution depends on several
factors
The distribution of the original data
It cannot be controlled in the very-high-dimensional pattern space
The feature extraction technique in use affects the metric and the
distribution of all the generated feature vectors
Feature invariance
Some pattern space directions are retained better than others.
When feature extraction works properly, semantically similar patterns are
mapped nearer to each other
Class Distributions (cont.)

The overall shape of the training set, after it has been mapped from the
original data space to the feature vector space, determines the overall
organization of the SOM.
The class distribution of the studied object subset or
class, relative to the overall shape of the feature vector
distribution.
Class Distributions (cont.)

Measures the denseness or locality of feature vectors on
a SOM

SDH (Pampalk, Rauber, & Merkl, 2002)


Each data point is mapped not only to its nearest SOM unit but to its s
nearest units, with reciprocally decreasing fractions.
Quantitative locality measures
Map usage, average pair distance, fragmentation,
and purity (Pullwitt, 2002)
These measures fail to take into account the topological structure of the class
BMU Probabilities


Calculate the a priori probability of each SOM unit
being the BMU for any vector x of the feature
space.
Probability density function (pdf)
BMU Probabilities (cont.)

Voronoi region

The set of vectors in the original feature space that are closer to
the weight vector of unit i than to any other weight vector

We are actually replacing the continuous pdf with a discrete
probability histogram by counting the number of times that any
given map unit is the BMU.
The probability histogram of class C on the SOM surface
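
The counting step described on this slide can be sketched as follows. This is a minimal illustration, assuming Euclidean distance for the BMU search and NumPy arrays for the data; the function names `bmu_indices` and `class_histogram` are hypothetical and do not come from the paper or from PicSOM.

```python
import numpy as np

def bmu_indices(weights, vectors):
    """Return, for each feature vector, the index of its best-matching unit."""
    # Squared Euclidean distance between every vector and every unit weight.
    d2 = ((vectors[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def class_histogram(weights, class_vectors):
    """Discrete probability histogram of one class over the SOM units."""
    hits = np.bincount(bmu_indices(weights, class_vectors),
                       minlength=len(weights))
    return hits / hits.sum()  # normalize the hit counts to unit sum

# Toy map with three units on a line and four members of class C.
weights = np.array([[0.0], [1.0], [2.0]])
class_vectors = np.array([[0.1], [0.2], [0.9], [1.1]])
P = class_histogram(weights, class_vectors)
```

Each vector falls into the Voronoi region of exactly one unit, so the histogram entries sum to one.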

BMU Entropy

The entropy H of a distribution $P = (P_0, P_1, \ldots, P_{k-1})$ is
calculated as
$H(P) = -\sum_{i=0}^{k-1} P_i \log P_i$
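
As a small worked example of the entropy of a class distribution, the sketch below uses the standard Shannon entropy. The natural logarithm and the convention 0 log 0 = 0 are assumptions made here; the slide does not state the log base.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (natural log assumed)."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]  # terms with P_i = 0 contribute nothing
    return float(-(nz * np.log(nz)).sum())

# A class concentrated on one map unit versus one spread over all units.
concentrated = [1.0, 0.0, 0.0, 0.0]
uniform = [0.25, 0.25, 0.25, 0.25]
h_low = entropy(concentrated)
h_high = entropy(uniform)
```

The concentrated class has zero entropy, while the uniform one reaches the maximum, log k, which is why a low entropy is read as a compact class.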
BMU Entropy (cont.)
SOM Surface Convolutions

Entropy Drawback



The calculation of entropies does not yet take into account the
spatial topology of the SOM units in any way.
It is the topological order of the units that separates SOM from
other vector quantization methods.
The convolution method bears similarity to the smoothed data
histogram (SDH) approach (Pampalk, Rauber, & Merkl, 2002), in which
data points are not mapped one-to-one to their BMUs but are spread
over the s closest map units in the feature space.
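
The spreading of each data point over its s closest units can be sketched as below. The weighting is an assumption: "reciprocally decreasing fractions" is read here as a weight proportional to 1/rank for the unit of that rank, which may differ from the weights used in the original SDH paper. The function name `smoothed_histogram` is hypothetical.

```python
import numpy as np

def smoothed_histogram(weights, vectors, s=3):
    """Spread each vector over its s nearest units (weights ~ 1/rank, assumed)."""
    frac = 1.0 / np.arange(1, s + 1)
    frac /= frac.sum()  # the fractions of one data point sum to 1
    d2 = ((vectors[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :s]  # s closest units per vector
    hist = np.zeros(len(weights))
    for row in nearest:
        hist[row] += frac  # nearest unit gets the largest share
    return hist / hist.sum()

# One data point on a four-unit map, spread over its two nearest units.
weights = np.array([[0.0], [1.0], [2.0], [3.0]])
vectors = np.array([[0.1]])
sdh = smoothed_histogram(weights, vectors, s=2)
```

Unlike a convolution on the map grid, the neighbours here are chosen by distance in the feature space, not by position on the SOM surface.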
SOM Surface Convolutions (cont.)
SOM Surface Convolutions (cont.)


The larger the convolution window is, the smoother
the overall shape of the distribution becomes, because
the details vanish.
Selecting a proper size for the convolution
mask can be seen as a form of the general
scale-space problem
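
The effect of the window size can be illustrated with a small sketch. The triangular (linearly decaying) mask and the zero padding at the map border are assumptions made for this example; the slide does not say which mask shape is used. The names `triangular_mask` and `convolve_surface` are hypothetical.

```python
import numpy as np

def triangular_mask(radius):
    """Separable 2-D mask whose weights fall off linearly from the centre."""
    ramp = radius + 1 - np.abs(np.arange(-radius, radius + 1))
    return np.outer(ramp, ramp).astype(float)

def convolve_surface(hist2d, radius):
    """Smooth a class histogram laid out on the SOM grid, then renormalize."""
    mask = triangular_mask(radius)
    padded = np.pad(hist2d, radius)  # zero padding at the map border
    out = np.zeros_like(hist2d, dtype=float)
    rows, cols = hist2d.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += mask[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out / out.sum()  # back to unit sum

# A single hit in the middle of a 5x5 map, smoothed with two window sizes.
hist = np.zeros((5, 5))
hist[2, 2] = 1.0
narrow = convolve_surface(hist, 1)
wide = convolve_surface(hist, 2)
```

The wider mask spreads the same unit of probability over more map units, so the peak is lower and fine detail is lost.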
SOM Surface Convolutions (cont.)
Multiple Feature Extractions




It is possible to use more than one feature extraction method in
parallel.
In CBIR, three different feature categories are generally
recognized: color, texture, and shape features.
Let us denote by P=(P0, P1, …, Pk-1) and Q=(Q0, Q1,…, Qk-1) the
distributions of a class on two SOMs trained with different features.
While H(P) and H(Q) measure the distributions of the single feature
vectors, the mutual information I(P, Q) can be used for studying
the interplay between them
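
A minimal sketch of the mutual information between two feature maps follows. It assumes that the joint distribution is estimated by counting, for each object of the class, the pair of BMU indices it receives on the two SOMs; this estimation step is an assumption, as the slide does not spell it out. The function name `mutual_information` is hypothetical.

```python
import numpy as np

def mutual_information(bmu_a, bmu_b, k_a, k_b):
    """I(P, Q) from paired BMU indices of the same objects on two SOMs."""
    joint = np.zeros((k_a, k_b))
    np.add.at(joint, (bmu_a, bmu_b), 1.0)  # count co-occurring BMU pairs
    joint /= joint.sum()
    p = joint.sum(axis=1, keepdims=True)  # marginal P on the first SOM
    q = joint.sum(axis=0, keepdims=True)  # marginal Q on the second SOM
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (p @ q)[nz])).sum())

# Two features that place every object identically share all information.
same = mutual_information(np.array([0, 1, 0, 1]), np.array([0, 1, 0, 1]), 2, 2)
# Two features whose BMU indices are unrelated share none.
indep = mutual_information(np.array([0, 0, 1, 1]), np.array([0, 1, 0, 1]), 2, 2)
```

A value near zero suggests the two features carry independent information about the class, while a large value suggests they are redundant.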
Multiple Feature Extractions (cont.)



The pair HT and nHT has by far the largest value of mutual
information
The pair CS and SC has a large value on both SOMs
The value for EH and HT is high on the smaller SOM, but not so
much on the larger SOM with more resolution.
Bayesian Decision Estimation

Use the Bayesian decision rule to make an optimal
classification
Posterior probability

To decide on the jth object’s membership in class C
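
The formulas for this slide did not survive in the transcript, so the following is only a sketch of how Bayes' rule could be applied on the SOM surface, not the paper's own expression. It assumes the posterior of class C at map unit i is computed from the class histogram P(i | C), the overall BMU probability P(i), and the class prior P(C); the function name `class_posterior` is hypothetical.

```python
import numpy as np

def class_posterior(p_unit_given_class, p_unit, prior):
    """P(C | i) = P(i | C) P(C) / P(i) for every map unit i (Bayes' rule)."""
    p_unit_given_class = np.asarray(p_unit_given_class, dtype=float)
    p_unit = np.asarray(p_unit, dtype=float)
    post = np.zeros_like(p_unit)
    seen = p_unit > 0  # units that never win get posterior 0
    post[seen] = p_unit_given_class[seen] * prior / p_unit[seen]
    return post

# Ten objects with BMU counts (4, 4, 2); four belong to C with counts (3, 1, 0).
posterior = class_posterior([0.75, 0.25, 0.0], [0.4, 0.4, 0.2], prior=0.4)
# An object whose BMU is unit 0 would be assigned to C if posterior[0] > 0.5.
in_class = bool(posterior[0] > 0.5)
```

The posterior at each unit matches the fraction of that unit's hits that belong to C (3 of 4, 1 of 4, 0 of 2).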

Bayesian Decision Estimation (cont.)

Query by example


Relevance feedback


The system presents a number of images to the user at each query round, and
the user is expected to evaluate their relevance to her current task.
The system incrementally fine-tunes the selection so that more and more
relevant images are shown at consecutive query rounds
Choose next image for the user


Maximal probability of relevance
Minimal probability of nonrelevance
Bayesian Decision Estimation (cont.)
Bayesian Decision Estimation (cont.)

Relevance feedback problem





Let us denote the history of the query up to round t-1
by $H_{t-1} = (D_0, R_0, D_1, R_1, \ldots, D_{t-1}, R_{t-1})$
Maximize the current probability of relevance
$P(x \in x_{rel} \mid H_{t-1})$
The distributions are updated by adding the hits caused by the new relevant
and nonrelevant samples to the map units,
convolving them with the mask used,
and renormalizing the distributions to unit sums
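
One query round of this update can be sketched as below. Several choices here are assumptions: the convolution step is left out to keep the example short, and the relevant and nonrelevant distributions are combined by simple subtraction as one possible way of favouring high relevance and low nonrelevance. The function names `normalized` and `next_image` are hypothetical.

```python
import numpy as np

def normalized(hits):
    """Scale hit counts to unit sum (left unchanged if there are no hits)."""
    total = hits.sum()
    return hits / total if total > 0 else hits

def next_image(bmus, relevant, nonrelevant, shown, n_units):
    """Pick the unseen image whose BMU looks most relevant so far."""
    rel = normalized(np.bincount(bmus[relevant], minlength=n_units).astype(float))
    non = normalized(np.bincount(bmus[nonrelevant], minlength=n_units).astype(float))
    score = (rel - non)[bmus]  # per-image score read off at its BMU
    score[list(shown)] = -np.inf  # never show an image twice
    return int(score.argmax())

# Five images on a three-unit map; image 0 was marked relevant, image 3 was not.
bmus = np.array([0, 0, 1, 2, 2])
choice = next_image(bmus, relevant=[0], nonrelevant=[3], shown={0, 3}, n_units=3)
```

Image 1 shares its map unit with the relevant image 0, so it is chosen for the next round.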
Bayesian Decision Estimation (cont.)
Conclusions



The entropy of the distribution characterizes quantitatively the
compactness of an object class.
The proposed method can be used as an efficient way of comparing
these features and the SOMs produced with them.
We showed that the mutual information of the distributions
can be used to identify both the most similar and the most
uncorrelated features
can also be used to select the subset of the feature extraction
methods with the most independent features.
Bayesian decision rule
Can be used for choosing either the most probable class for a data item, or the
most likely data item belonging to a given class.
Personal Opinions

Advantage
Combines entropy, mutual information, and smoothing
to find the important features and the independent
features.
Application
Feature extraction
Drawback
The structure of the paper is not good.
Some diagrams are unclear, which makes them difficult to understand