No Slide Title - FSU Computer Science Department


Research Activities at
Computer Vision and Image Understanding Group
Florida State University
Xiuwen Liu
Florida State Vision Group
Department of Computer Science
Florida State University
http://fsvision.cs.fsu.edu
Outline
 Motivations
• Some applications of computer vision techniques
 Computer Vision and Image Understanding Group
 Some of the research projects
 Contact information
Introduction

An image patch represented by hexadecimals
Introduction - continued
 Fundamental problem in computer vision
• Given a matrix of numbers representing an image, or a sequence of images, how do we generate a perceptually meaningful description of the matrix?
– An image can be a color image, a gray-level image, or another format such as a remote sensing image
– A two-dimensional matrix represents a single image
– A three-dimensional matrix represents a sequence of images
 A video sequence is a 3-D matrix
 A movie is also a 3-D matrix
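A minimal sketch of the matrix view described above, assuming NumPy and a gray-level image; the sizes and the random pixel values are made up purely for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
height, width, n_frames = 4, 6, 3

# A single gray-level image: a 2-D matrix of intensities in [0, 255].
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(height, width), dtype=np.uint8)

# A gray-level video: frames stacked along a new leading axis give a 3-D matrix.
video = np.stack([gray] * n_frames, axis=0)

# The same patch written out as hexadecimal numbers, one string per image row.
hex_rows = [" ".join(f"{int(v):02X}" for v in row) for row in gray]

print(gray.shape)    # (4, 6)
print(video.shape)   # (3, 4, 6)
print(hex_rows[0])
```

The vision problem is to go from arrays like `gray` or `video` to a description a person would find meaningful.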
Introduction - continued
Introduction - continued
 Why do we want to work on this problem?
• It is very interesting theoretically
– Developing a computational model for the problem involves many disciplines
• It is a key component in understanding and modeling intelligence
– Note that, by a commonly cited estimate, about 50% of the brain is devoted to vision
• It has many practical applications
– Internet applications
– Movie-making applications
– Military applications
Computer Vision Applications
 No hands across America
• Sponsored by Delco Electronics, AssistWare Technology, and Carnegie Mellon University
• Navlab 5 drove from Pittsburgh, PA to San Diego, CA, using the RALPH computer program
• The trip was 2849 miles, of which 2797 miles were driven automatically with no hands
– That is 98.2% of the distance
Computer Vision Applications – continued
Computer Vision Applications – continued
DARPA Grand Challenge: http://www.darpa.mil/grandchallenge/index.htm
Computer Vision Applications – continued
 Military applications
• Automated target recognition
Computer Vision Applications – continued
Computer Vision Applications – continued

Extracted hydrographic regions
Computer Vision Applications – continued
 Medical image analysis
• Characterize different types of tissues in medical images for automated medical image analysis
Computer Vision Applications – continued
Computer Vision Applications – continued
 Biometrics
• From faces, fingerprints, iris patterns, ...
• It has many applications such as security, ATM withdrawal, credit card management, ...
Computer Vision Applications – cont.
Computer Vision Applications – continued
 Content-based image retrieval has become an active research area to meet the need to search images on the web in a meaningful way
• Color histograms have been widely used
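As an illustration of color-histogram retrieval, here is a minimal sketch assuming NumPy, 8-bit RGB images, and histogram intersection as the similarity measure. The slides do not say which measure is used; the function names and the toy database are hypothetical.

```python
import numpy as np

def color_histogram(image, bins_per_channel=4):
    """Normalized joint RGB histogram of an (H, W, 3) uint8 image."""
    pixels = image.reshape(-1, 3).astype(np.int64)
    # Quantize each channel into bins_per_channel levels.
    q = pixels * bins_per_channel // 256
    # Combine the three per-channel bin indices into one joint index.
    idx = (q[:, 0] * bins_per_channel + q[:, 1]) * bins_per_channel + q[:, 2]
    hist = np.bincount(idx, minlength=bins_per_channel ** 3).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return float(np.minimum(h1, h2).sum())

rng = np.random.default_rng(0)
query = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Toy database: the query with its pixels shuffled, and a pure red image.
shuffled = rng.permutation(query.reshape(-1, 3)).reshape(32, 32, 3)
red = np.zeros((32, 32, 3), dtype=np.uint8)
red[..., 0] = 255
database = {"shuffled_query": shuffled, "red": red}

query_hist = color_histogram(query)
ranked = sorted(
    database,
    key=lambda name: histogram_intersection(query_hist, color_histogram(database[name])),
    reverse=True,
)
print(ranked)
```

Because a histogram ignores pixel positions, the shuffled image scores as a perfect match. That shows both why the feature is cheap and robust and why it cannot capture spatial structure.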
Content-Based Image Retrieval – cont.
Vision-Based Image Morphing
Vision-Based Image Morphing - continued
Computer Vision and Image Understanding Group

Faculty: Xiuwen Liu, Anuj Srivastava, Washington Mio, Eric Klassen

Goals: Develop and implement effective image understanding algorithms and systems for images and videos from multiple modalities, including visible, infrared, and range sensors

Approaches: Learning-based vision algorithms, statistical modeling of objects, computational modeling and analysis of textures, statistical modeling of shapes, stochastic optimization, inference algorithms on manifolds, and Bayesian inference
Research Projects
 The group offers a wide range of research possibilities
• Implementation projects
• Development of new applications
• Development of new algorithms
• Theoretical and mathematical analysis of algorithms
Implementation Projects
 These projects involve implementing proven ideas and algorithms on specific datasets with specific interface and programming language constraints
• For example, Haitao Wu implemented a graphical user interface for one of our face recognition algorithms as his Master's project
• Yu Wang implemented a web-based interface for a content-based image retrieval algorithm
A Real-time Recognition/Tracking System
Content-based Image Retrieval
Image Query System by Yu Wang
Future Implementation Possibilities
 Implement a Java-based system for face detection
 Implement a Java-based system for learning
 Implement and improve web-based systems for content-based image and video retrieval
Generic Image Modeling

How can we characterize all these images perceptually?
Spectral Histogram Representation
 Spectral histogram
• Given a bank of filters $F^{(\alpha)}$, $\alpha = 1, \ldots, K$, a spectral histogram is defined as the marginal distribution of filter responses
\[ I^{(\alpha)}(v) = F^{(\alpha)} * I(v) \]
\[ H_I^{(\alpha)}(z) = \frac{1}{|I|} \sum_v \delta\left(z - I^{(\alpha)}(v)\right) \]
\[ H_I = \left( H_I^{(1)}, H_I^{(2)}, \ldots, H_I^{(K)} \right) \]
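The definition above can be sketched in a few lines. This is an illustration under stated assumptions, not the group's implementation: it assumes NumPy, circular (FFT-based) convolution at the image boundary, a fixed number of histogram bins, and a tiny hypothetical filter bank (intensity plus two gradient filters).

```python
import numpy as np

def filter_response(image, kernel):
    """Response of one filter, using circular convolution computed with the FFT."""
    spectrum = np.fft.fft2(image) * np.fft.fft2(kernel, s=image.shape)
    return np.real(np.fft.ifft2(spectrum))

def spectral_histogram(image, kernels, n_bins=11, value_range=(-255.0, 255.0)):
    """Concatenate the normalized response histograms of all K filters."""
    marginals = []
    for kernel in kernels:
        response = np.clip(filter_response(image, kernel), *value_range)
        counts, _ = np.histogram(response, bins=n_bins, range=value_range)
        marginals.append(counts / response.size)   # the 1/|I| normalization
    return np.concatenate(marginals)

# Hypothetical filter bank: intensity (delta), x-gradient, y-gradient.
kernels = [
    np.array([[1.0]]),
    np.array([[-1.0, 1.0]]),
    np.array([[-1.0], [1.0]]),
]

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16)).astype(float)
H = spectral_histogram(image, kernels)
print(H.shape)   # (33,)
```

Each block of `n_bins` entries is one marginal distribution, so it sums to one. The full feature vector has `K * n_bins` entries regardless of the image size.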
Spectral Histogram Representation - continued
 Choice of filters
• Laplacian of Gaussian filters
• Gabor filters
• Gradient filters
• Intensity filter
LoG filter
Gabor filter
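For reference, a sketch of how LoG and Gabor kernels of this kind are commonly constructed, assuming NumPy. The parameterization (`sigma`, `wavelength`, `theta`), the isotropic Gaussian envelope, and the zero-mean normalization are assumptions for illustration; the slides do not give the exact forms used.

```python
import numpy as np

def _grid(size):
    """Integer coordinate grids centered at zero; size should be odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    return x.astype(float), y.astype(float)

def log_kernel(size, sigma):
    """Laplacian of Gaussian, shifted to zero mean."""
    x, y = _grid(size)
    r2 = x ** 2 + y ** 2
    k = (r2 - 2.0 * sigma ** 2) / sigma ** 4 * np.exp(-r2 / (2.0 * sigma ** 2))
    return k - k.mean()   # zero mean: no response on constant regions

def gabor_kernel(size, sigma, wavelength, theta):
    """Cosine (even) Gabor component at orientation theta, shifted to zero mean."""
    x, y = _grid(size)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    k = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    return k - k.mean()

log_k = log_kernel(9, sigma=1.5)
gabor_k = gabor_kernel(9, sigma=2.0, wavelength=4.0, theta=np.pi / 4)
print(log_k.shape, gabor_k.shape)   # (9, 9) (9, 9)
```

Varying `sigma` gives filters at different scales, and varying `theta` gives Gabor filters at different orientations, which is how a bank of K filters is usually assembled.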
Spectral Histogram Representation - continued
A Texture Synthesis Example

A white noise image was transformed to a perceptually similar texture by matching the spectral histogram
Average spectral histogram error
Texture Synthesis Examples - continued
 A random texture image
Observed image
Synthesized image
Texture Synthesis Examples - continued
 An image with periodic structures
Observed image
Synthesized image
Texture Synthesis Examples - continued
 A mud image with some animal footprints
Mud image
Synthesized image
Texture Synthesis Examples - continued
 A random texture image with elements
Observed image
Synthesized image
Object Synthesis Examples


As in texture synthesis, we start from a random image
In addition, similar object images are used as boundary conditions, in that the corresponding pixel values are not updated during the sampling process
Object Synthesis Examples - continued
Object Synthesis Examples - continued
Principal Component Analysis
Eigenvalues of 400 Eigenvectors
Principal Component Analysis - continued
Original Image
Reconstructed using 50 PCs
Reconstructed using 200 PCs
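Reconstruction from a limited number of principal components can be sketched as follows, assuming NumPy and vectorized images stored one per row. The data here are random stand-ins, not face images, and the component counts are smaller than the 50 and 200 on the slide.

```python
import numpy as np

def pca_basis(X, k):
    """X: (n_samples, n_pixels). Returns the mean and the top-k principal directions."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are eigenvectors of the covariance matrix.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T          # shapes (n_pixels,) and (n_pixels, k)

def reconstruct(x, mean, U):
    """Project x onto the span of U around the mean, then map back to pixel space."""
    return mean + U @ (U.T @ (x - mean))

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 100))     # stand-in for 40 vectorized training images

errors = []
for k in (5, 20, 39):
    mean, U = pca_basis(X, k)
    errors.append(float(np.linalg.norm(X[0] - reconstruct(X[0], mean, U))))
print(errors)
```

The error shrinks as more components are kept. With 40 training images the centered data has rank at most 39, so 39 components reproduce a training image up to rounding.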
Principal Component Analysis - continued
Principal Component Analysis - continued
Difference Between Reconstruction and Sampling
Reconstruction alone is not sufficient to show the adequacy of a representation; sampling from the set of images that share the same representation is more informative
Face detection based on spectral representations
 Face detection aims to find all instances of faces in a given image
 Each image window is represented by its spectral histogram
• A support vector machine is trained on training faces
• The trained support vector machine is then used to classify each image window in an input image
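The window-by-window procedure above can be sketched as a sliding-window loop. This is only an outline, assuming NumPy: `featurize` and `classify` are placeholders for the spectral histogram and the trained support vector machine, and here they are replaced by a simple intensity histogram and a hypothetical threshold rule so that the example runs on its own.

```python
import numpy as np

def sliding_window_detect(image, window, stride, featurize, classify):
    """Return top-left corners (row, col) of windows classified as positive."""
    hits = []
    h, w = image.shape
    for r in range(0, h - window + 1, stride):
        for c in range(0, w - window + 1, stride):
            patch = image[r:r + window, c:c + window]
            if classify(featurize(patch)):
                hits.append((r, c))
    return hits

def intensity_histogram(patch, n_bins=4):
    """Stand-in feature: normalized histogram of pixel intensities."""
    counts, _ = np.histogram(patch, bins=n_bins, range=(0, 256))
    return counts / patch.size

def toy_classifier(feature):
    """Stand-in for a trained SVM: fires when most pixels are in the brightest bin."""
    return feature[-1] > 0.9

image = np.zeros((32, 32), dtype=np.uint8)
image[8:16, 16:24] = 255           # one bright 8 x 8 "object"

hits = sliding_window_detect(image, window=8, stride=8,
                             featurize=intensity_histogram, classify=toy_classifier)
print(hits)   # [(8, 16)]
```

A practical detector would also scan over several window sizes and merge overlapping detections; those steps are omitted here.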

More results at http://fsvision.fsu.edu/face-detection
Face detection - continued
Face detection - continued
Face detection - continued
Rotation invariant face detection
Rotation invariant face detection - continued
Linear representations
 Linear representations are widely used in appearance-based object recognition applications
• Simple to implement and analyze
• Efficient to compute
• Effective for many applications
\[ \alpha(I, U) = U^T I \in \mathbb{R}^d \]
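A minimal sketch of a linear representation in use, assuming NumPy: an image vector of length n is reduced to d coefficients by an n x d basis U, and recognition is done by nearest neighbor in coefficient space. The random orthonormal basis and random "images" are stand-ins; in practice U would come from PCA, FDA, ICA, or a similar method.

```python
import numpy as np

def project(image_vector, U):
    """Coefficients U^T I of an image vector I with respect to the basis U."""
    return U.T @ image_vector

def nearest_neighbor_label(coeffs, train_coeffs, train_labels):
    """Label of the training image whose coefficients are closest."""
    distances = np.linalg.norm(train_coeffs - coeffs, axis=1)
    return train_labels[int(np.argmin(distances))]

rng = np.random.default_rng(0)
n, d = 64, 5

# Hypothetical basis: orthonormal columns from the QR factorization of a random matrix.
U, _ = np.linalg.qr(rng.normal(size=(n, d)))

train_images = rng.normal(size=(6, n))       # 6 stand-in training images
train_labels = [0, 0, 1, 1, 2, 2]
train_coeffs = train_images @ U              # row i holds U^T I_i

# A slightly perturbed copy of training image 3, which belongs to class 1.
query = train_images[3] + 0.01 * rng.normal(size=n)
coeffs = project(query, U)
label = nearest_neighbor_label(coeffs, train_coeffs, train_labels)
print(coeffs.shape, label)
```

All the work at recognition time is one matrix-vector product and a few distance computations in d dimensions, which is why these representations are efficient.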
Standard Linear Representations
 Principal Component Analysis
• Designed to minimize the reconstruction error on the training set
• Obtained by calculating eigenvectors of the covariance matrix
 Fisher Discriminant Analysis
• Designed to maximize the separation between class means relative to the within-class scatter
• Obtained by solving a generalized eigenvalue problem
 Independent Component Analysis
• Designed to maximize the statistical independence among coefficients along different directions
• Obtained by solving an optimization problem with an objective function such as mutual information or negentropy
Standard Linear Representations - continued
 Standard linear representations are suboptimal for recognition applications
• Evidence in the literature [1][2]
• A toy example
– Standard representations give the worst recognition performance
Proposed Approach
 Optimal Component Analysis (OCA)
• Derive a performance function that is related to the recognition performance
• Formulate the problem of finding optimal representations as an optimization problem on the Grassmann manifold
• Use an MCMC stochastic gradient algorithm for optimization
Performance Measure
 It must have continuous directional derivatives
 It must be related to the recognition performance
 It must be efficient to compute
 Based on the nearest neighbor classifier
• However, it can be applied to other classifiers, as it forms clusters of images from the same class that are far from clusters of other classes
• See an example for support vector machines
Performance Measure - continued

Suppose there are C classes to be recognized
• Each class has ktrain training images
• Each class has kcross cross-validation images
Performance Measure - continued

h is a monotonically increasing and bounded function
• We used h(x) = 1/(1 + exp(-2bx))
• Note that as b → ∞, F(U) is exactly the recognition performance using the nearest neighbor classifier
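The role of b can be seen numerically. The sketch below uses only the Python standard library and evaluates h at a few hypothetical arguments. The quantity that F(U) passes to h is not shown in this transcript, so the x values are placeholders.

```python
import math

def h(x, b):
    """h(x) = 1 / (1 + exp(-2 b x)): monotonically increasing, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-2.0 * b * x))

xs = (-0.5, -0.05, 0.0, 0.05, 0.5)   # placeholder arguments
for b in (1, 10, 100):
    print(b, [round(h(x, b), 4) for x in xs])
```

For small b the function is a smooth ramp, which gives usable derivatives. As b grows it approaches a step at x = 0, that is, a hard correct/incorrect count, which is why the limit recovers the actual recognition rate.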

Some examples of F(U) along some directions
Performance Measure - continued
 F(U) depends on the span of U but is invariant to a change of basis
• In other words, F(U)=F(UO) for any d x d orthogonal matrix O
• The search space of F(U) is the set of all d-dimensional subspaces, which is known as the Grassmann manifold
– It is not a flat vector space, and a gradient flow must take the underlying geometry of the manifold into account; see [3] [4] [5] for related work
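The invariance can be checked numerically. The sketch below, assuming NumPy and random stand-in data, shows that all pairwise distances between projected images are unchanged when U is replaced by UO. Any measure built from nearest-neighbor distances therefore depends only on the subspace spanned by U.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 50, 5, 8

U, _ = np.linalg.qr(rng.normal(size=(n, d)))   # n x d basis with orthonormal columns
O, _ = np.linalg.qr(rng.normal(size=(d, d)))   # d x d orthogonal matrix
images = rng.normal(size=(m, n))               # m stand-in image vectors

def pairwise_distances(basis, images):
    """Euclidean distances between the projected coefficients of all image pairs."""
    coeffs = images @ basis                    # row i holds basis^T I_i
    diff = coeffs[:, None, :] - coeffs[None, :, :]
    return np.linalg.norm(diff, axis=2)

D1 = pairwise_distances(U, images)
D2 = pairwise_distances(U @ O, images)
print(np.allclose(D1, D2))   # True
```

The coefficients themselves do change under O; only the distances between them are preserved.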
Deterministic Gradient Flow - continued

Gradient at [J] (first d columns of n x n identity matrix)
Deterministic Gradient Flow - continued
 Gradient at U: Compute Q such that QU=J
Deterministic gradient flow on Grassmann manifold
Stochastic Gradient and Updating Rules

Stochastic gradient is obtained by adding a stochastic component

Discrete updating rules
MCMC Simulated Annealing Optimization Algorithm

Let X0 be any initial condition and t=0
1. Calculate the gradient matrix A(Xt)
2. Generate d(n-d) independent realizations of wij's
3. Compute a candidate Y for Xt+1 according to the updating rules
4. Compute F(Y) and F(Xt) and set dF=F(Y)- F(Xt)
5. Set Xt+1 = Y with probability min{exp(dF/Dt),1}; otherwise set Xt+1 = Xt
6. Set Dt+1 = Dt / g and set t=t+1
7. Go to step 1
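The loop above can be sketched as follows, assuming NumPy. Several pieces are stand-ins because the updating rules are not reproduced in this transcript. The candidate is formed by a gradient step plus projected Gaussian noise, then re-orthonormalized with a QR factorization (a simple retraction, not the geodesic update). The noise is an n x d Gaussian matrix projected onto the tangent directions rather than d(n-d) separate wij's. The objective F is a hypothetical toy function that depends only on the subspace.

```python
import numpy as np

def anneal_on_subspaces(F, grad_F, X0, n_iter=300, step=0.05, D0=0.1, g=1.02, seed=0):
    """Simulated-annealing search over n x d orthonormal bases (sketch only)."""
    rng = np.random.default_rng(seed)
    n, d = X0.shape
    X, D = X0, D0
    best_X, best_F = X0, F(X0)
    for _ in range(n_iter):
        P = np.eye(n) - X @ X.T                  # projector onto directions orthogonal to span(X)
        A = P @ grad_F(X)                        # step 1: gradient direction
        W = P @ rng.normal(size=(n, d))          # step 2: random component
        # Step 3 (assumed rule): move, then re-orthonormalize with QR.
        Y, _ = np.linalg.qr(X + step * A + np.sqrt(2.0 * step * D) * W)
        dF = F(Y) - F(X)                         # step 4
        if dF >= 0 or rng.random() < np.exp(dF / D):   # step 5: min{exp(dF/D), 1}
            X = Y
        if F(X) > best_F:
            best_X, best_F = X, F(X)
        D = D / g                                # step 6: cool the temperature
    return X, best_X, best_F

# Toy objective: variance captured under a diagonal covariance C.
n, d = 10, 2
C = np.diag(np.arange(n, 0, -1.0))
F = lambda X: float(np.trace(X.T @ C @ X))
grad_F = lambda X: 2.0 * C @ X

X0 = np.eye(n)[:, -d:]                           # worst subspace: F(X0) = 2 + 1 = 3
X_final, X_best, F_best = anneal_on_subspaces(F, grad_F, X0)
print(F(X0), F_best)
```

The starting point here is a critical point of the deterministic gradient (the projected gradient is zero), so only the random component can move the search away from it. That is the motivation for adding the stochastic term.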
The Toy Example

The following result on the toy example shows the effectiveness of the algorithm
• The following figure shows the recognition performance of Xt and F(Xt)
ORL Face Dataset
Experimental Results on ORL Dataset

Here the image size is 92 x 112 and the subspace dimension is d = 5
• Comparison using gradient, stochastic gradient, and the proposed technique with different initial conditions
PCA
ICA
FDA
Results on ORL Dataset - continued
 With respect to d and ktrain
Figure panels: d=3, ktrain=5; d=5, ktrain=1; d=10, ktrain=5; d=5, ktrain=2; d=20, ktrain=5; d=5, ktrain=8
Results on CMU PIE Dataset
 Here we used part of the CMU PIE dataset
• There are 66 subjects
• Each subject has 21 pictures under different lighting conditions
Figure panels: X0=PCA, d=10; X0=ICA, d=10; X0=FDA, d=5
Some Comparative Results on ORL

Comparison where performance on cross validation images is maximized
• In other words, the comparison shows the best performance that linear representations can achieve
• PCA – black dotted; ICA – red dash-dotted; FDA – green dashed; OCA – blue solid
Some Comparative Results on ORL - continued

Comparison where the performance on the training set is optimized
• In other words, it is a fair comparison
• PCA – black dotted; ICA – red dash-dotted; FDA – green dashed; OCA – blue solid
PROBABILITY MODELS FOR IMAGE ANALYSIS
Empirical Studies Indicate Patterns
Histogram of x-derivative
Need models that:
• are low-dimensional (computationally tractable)
• are accurate models of (real) observed clutter
• support the observed patterns
BESSEL K FORM
A Parametric Family:
K is the modified Bessel function of third kind
•Image statistics (under spectral decompositions) exhibit non-Gaussian behavior.
•This density explains the non-Gaussian and heavy-tailed nature of observed image statistics.
•The parameters p and c are easily estimated from the data using sample variance and kurtosis.
•This model is derived from first principles.
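A sketch of moment-based estimation of p and c, assuming NumPy. It relies on an assumption not stated on the slide: that the Bessel K form arises as a zero-mean Gaussian whose variance is gamma-distributed with shape p and scale c. Under that assumption the variance is p·c and the kurtosis is 3(1 + 1/p), which gives the estimators below. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def estimate_bessel_k_params(x):
    """Moment estimates: p = 3 / (kurtosis - 3), c = variance / p (assumed relations)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    kurtosis = np.mean(centered ** 4) / variance ** 2
    if kurtosis <= 3.0:
        raise ValueError("sample kurtosis must exceed 3 (heavier tails than Gaussian)")
    p = 3.0 / (kurtosis - 3.0)
    c = variance / p
    return p, c

# Synthetic check: draw from the assumed Gaussian scale mixture and recover (p, c).
rng = np.random.default_rng(0)
p_true, c_true = 2.0, 1.5
u = rng.gamma(shape=p_true, scale=c_true, size=400_000)   # random variances
x = np.sqrt(u) * rng.normal(size=u.size)                  # heavy-tailed samples
p_hat, c_hat = estimate_bessel_k_params(x)
print(p_hat, c_hat)
```

In use, `x` would be the pixel values of a filtered image (for example, a Gabor response), and the fitted density would be compared against the observed histogram.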
MODELING SUCCESS
Original Image
Filtered Image
Gabor Filter
Observed
Statistics of Filtered Image
Bessel K
SHAPE ANALYSIS
•Represent shapes as elements of infinite-dimensional manifolds
•Analyze shapes using geometry of that manifold
-- connect shapes using geodesic paths on the manifold
-- quantify shape differences using geodesic lengths
-- compute shape statistics (mean, variance)
•Applications:
-- clustering of objects according to shapes (learning)
-- shape based recognition of objects (recognition)
-- predicting shapes of partially-obscured objects (completion)
GEODESIC PATHS ON SHAPES
Basic Idea: Given two shapes (far left and far right), we connect them using a geodesic path on the shape manifold.
Example
Second Shape
First Shape
Eight shapes along geodesic path
Fish shapes taken from Surrey database
MEAN SHAPES
Their Mean Shape
Four Sample Shapes
CLUSTERING OF SHAPES
Results: 7 resulting clusters, each row is a cluster
3D Model-Based Recognition
Medical Image Analysis
 Advances in medical imaging provide many new opportunities and challenges for computer vision research
 Automated medical image analysis
Medical Image Analysis - continued
Medical Image Analysis - continued
Medical Image Analysis - continued
Medical Image Analysis - continued
Video Sequence Analysis and Summary
 Motion analysis based on correspondence
 Video stream-based surveillance
 Video summary
Courses
 Most Relevant Courses
• CAP 5638 Pattern Recognition (Spring 2004)
• CAP 5415 Principles and Algorithms of Computer Vision
– Fall 2004
• CAP 6417 Theoretical Foundations of Computer Vision
– STA 5106 Computational Methods in Statistics I
– STA 5107 Computational Methods in Statistics II
– Seminars and advanced studies
 Related Courses
• CAP 5615 Artificial Neural Networks
• CAP 5600 Artificial Intelligence
• CAP 5xxx Machine Learning
CAP 5638 Pattern Recognition
 It will be offered Spring 2004
• Tuesday and Thursday 6:45-8:00 PM
• At Love 103
• The course ref #: 07842
• http://www.cs.fsu.edu/~liux/courses/cap5638/
 It will cover
• The basics for pattern recognition
– Neural networks
• Machine learning algorithms
• Applications in data mining, pattern discovery, artificial intelligence, and security
 It should be interesting to anyone interested in more intelligent computer learning algorithms
Funding of the Group
 National Science Foundation
• DMS
• CISE IIS
• FRG
• ACT
 National Imagery and Mapping Agency
• NGA – National Geospatial-Intelligence Agency
 Army Research Office
Summary
 The Florida State Vision Group offers many interesting research topics/projects
• Efficient representations for generic images
• Computational models for object recognition and image classification
• Medical image analysis
• Motion/video sequence analysis and modeling
• They are challenging
• They are interesting
Contact Information
• Name: Xiuwen Liu
• Web site at http://fsvision.fsu.edu and http://www.cs.fsu.edu/~liux
• Email at [email protected]
• Office at LOV 166
• Phone 644-0050