Object Recognition

So what does object recognition involve?
Verification: is that a bus?
Detection: are there cars?
Identification: is that a picture of Mao?
Object categorization
sky
building
flag
banner
face
wall
street lamp
bus
bus
cars
Challenges 1: viewpoint variation
Michelangelo 1475-1564
Challenges 2: illumination
slide credit: S. Ullman
Challenges 3: occlusion
Magritte, 1957
Challenges 4: scale
Challenges 5: deformation
Xu, Beihong 1943
Challenges 7: intra-class variation
Two main approaches
Global sub-window
Part-based
Global Approaches
Aligned images x1, x2, x3
Vectors in high-dimensional space
Training
Involves some dimensionality reduction
Detector
Detection
– Scale / position range to search over
Detection
– Combine detection over space and scale.
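The search over scale and position, and the step that combines detections, can be sketched as follows. This is a minimal illustration, not the method of any paper cited in these slides: the window size, scale set, stride, and the use of greedy non-maximum suppression to merge overlapping detections are all assumptions made for the example.

```python
def sliding_windows(img_w, img_h, base=24, scales=(1.0, 1.5, 2.0), stride_frac=0.25):
    """Yield square windows (x, y, size) over the position and scale range."""
    for s in scales:
        size = int(round(base * s))
        if size > img_w or size > img_h:
            continue  # window does not fit at this scale
        stride = max(1, int(size * stride_frac))
        for y in range(0, img_h - size + 1, stride):
            for x in range(0, img_w - size + 1, stride):
                yield (x, y, size)


def iou(a, b):
    """Intersection over union of two square windows (x, y, size)."""
    ax, ay, asz = a
    bx, by, bsz = b
    ix = max(0, min(ax + asz, bx + bsz) - max(ax, bx))
    iy = max(0, min(ay + asz, by + bsz) - max(ay, by))
    inter = ix * iy
    union = asz * asz + bsz * bsz - inter
    return inter / union


def combine_detections(dets, overlap=0.3):
    """Greedy non-maximum suppression over (score, window) pairs.

    Keeps the highest-scoring window and drops any window that overlaps
    an already kept one by more than `overlap`.
    """
    kept = []
    for score, win in sorted(dets, reverse=True):
        if all(iou(win, k[1]) <= overlap for k in kept):
            kept.append((score, win))
    return kept
```

In use, a detector would score every window from `sliding_windows`, and the windows passing a threshold would be handed to `combine_detections`.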
PROJECT 1
• Turk and Pentland, 1991
• Belhumeur et al. 1997
• Schneiderman et al. 2004
• Viola and Jones, 2000
• Keren et al. 2001
• Osadchy et al. 2004
• Amit and Geman, 1999
• LeCun et al. 1998
• Belongie and Malik, 2002
• Schneiderman et al. 2004
• Agarwal and Roth, 2002
• Poggio et al. 1993
Object Detection
Problem:
Locate instances of object category in a given
image.
Asymmetric classification problem!
Background | Object (Category)
Very large | Relatively small
Complex (thousands of categories) | Simple (single category)
Large prior to appear in an image | Small prior
Easy to collect (not easy to learn from examples) | Hard to collect
Intuition
Black H is better!
[Figure: two candidate acceptance regions H enclosing the object class within the set of all images; the remainder is background]
We have a prior on
the distribution of all
natural images

Let H denote the acceptance region of a classifier. We propose to minimize Pr(All images) (≈ Pr(bkg)) inside H, while keeping the object samples in H.
Distribution of Natural Images – Boltzmann distribution
$$\Pr(I) \propto \exp\left(-\iint \left(I_x^2 + I_y^2\right)\,dx\,dy\right)$$
The integral is the image smoothness measure: the less smooth the image, the lower its probability.
In frequency domain:
$$\Pr(x) \propto \exp\left(-\sum_{k,l}\left(k^2 + l^2\right)x_{k,l}^2\right)$$
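A small numerical sketch of the smoothness prior in the spatial domain may help. It assumes the image is a list of rows of floats and replaces the derivatives I_x and I_y with forward differences; the function names are illustrative, and the probability is left unnormalized.

```python
import math


def smoothness_energy(img):
    """Sum of squared forward differences, a discrete stand-in for
    the integral of I_x^2 + I_y^2 over the image."""
    h, w = len(img), len(img[0])
    energy = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                energy += (img[y][x + 1] - img[y][x]) ** 2
            if y + 1 < h:
                energy += (img[y + 1][x] - img[y][x]) ** 2
    return energy


def unnormalized_prob(img):
    """exp(-energy): smoother images get a higher value."""
    return math.exp(-smoothness_energy(img))


flat = [[0.5] * 4 for _ in range(4)]
ramp = [[x / 3 for x in range(4)] for _ in range(4)]
checker = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
```

Under this measure the flat image has zero energy, the ramp a small energy, and the checkerboard the largest, so the checkerboard is the least probable of the three.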
Antiface
[Figure labels: anti-face vector d, object images Ω, acceptance region, directions of lower probability]
Main Idea
Claim: for x, y random natural images viewed as unit vectors, $|\langle x, y\rangle|$ is large on average.
Anti-Face detector is defined as a vector d satisfying:
– $|\langle d, x\rangle| < \varepsilon$ for all x in the positive class
– d is smooth
Then $|\langle d, x\rangle|$ is large on average for a random natural image x.
Discrimination
If x is an image and Ω is a target class:
$x \in \Omega \Rightarrow |\langle d, x\rangle|$ SMALL
$x \notin \Omega \Rightarrow |\langle d, x\rangle|$ LARGE
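The discrimination rule amounts to one inner product and a threshold. The sketch below assumes images are flat lists of floats, normalizes the input to a unit vector, and uses a hand-picked threshold `eps`; the toy detector and images are made up for illustration and are not learned as in the anti-face method.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [p / n for p in v]


def antiface_accepts(d, x, eps):
    """Accept x as a target-class image when |<d, x>| is small."""
    return abs(dot(d, normalize(x))) < eps


# Toy data: d is a smooth (constant) unit vector.
d = normalize([1.0, 1.0, 1.0, 1.0])
target = [1.0, -1.0, 1.0, -1.0]   # orthogonal to d, so |<d, x>| = 0
smooth_bg = [0.4, 0.5, 0.6, 0.5]  # smooth image, so |<d, x>| is close to 1
```

Here `antiface_accepts(d, target, 0.1)` accepts and `antiface_accepts(d, smooth_bg, 0.1)` rejects, which is the SMALL/LARGE split described above.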
Cascade of Independent Detectors
[Figure: cascade of detectors d1, d2, d3; labels: 7 inner products, 4 inner products]
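A cascade can be sketched as a loop that stops at the first detector whose response is too large. This is a minimal sketch under assumed conventions (unit-length inputs, one shared threshold `eps`, hand-made orthogonal detectors); it also returns how many inner products were spent, since early rejection is what makes a cascade cheap.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cascade_accepts(detectors, x, eps):
    """Return (accepted, inner_products_used).

    x is rejected by the first detector d with |<d, x>| >= eps;
    it is accepted only if every detector responds below eps.
    """
    used = 0
    for d in detectors:
        used += 1
        if abs(dot(d, x)) >= eps:
            return False, used
    return True, used


# Hypothetical detectors, mutually orthogonal unit vectors.
d1 = [0.5, 0.5, 0.5, 0.5]
d2 = [0.5, 0.5, -0.5, -0.5]
d3 = [0.5, -0.5, -0.5, 0.5]
detectors = [d1, d2, d3]
```

An input orthogonal to all three detectors passes after three inner products, while a smooth background patch is typically rejected by `d1` after a single one.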
PROJECT 2
Detect road signs in video
1) Use antiface method to learn a road sign under viewpoint variation
2) Use sign spatial location in the frame as an additional cue
3) Use scale change as an additional cue
4) Use evidence integration to combine evidence of sign presence in the video stream.
Training with small number of examples
• The majority of object detection methods require a large number of training examples.
• Goal: to design a classifier that can learn from a small number of examples
• Train existing classifiers on few examples
Overfitting: the classifier learns the training examples by heart and performs poorly on unseen examples.
Linear SVM
[Figure: maximal-margin separation of Class 1 and Class 2, with enough training data and with not enough training data]
Linear SVM – Detection Task
$\{x : w \cdot x + b \ge 0\}$
Class 1
Class 2
MM with prior
1) min P(natural images) in H
2) positive samples ⊂ H
3) wide margin
$\{x : w_B \cdot x + b \ge 0\}$
Object class
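For a linear classifier, the acceptance region is a half-space, and the margin of a canonical maximal-margin solution has width 2/||w||. The sketch below assumes the region is the set of x with w·x + b ≥ 0; it does not implement training, and it does not include the natural-image prior.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b of a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def in_acceptance_region(w, b, x):
    """True when x lies in the half-space w.x + b >= 0."""
    return decision_value(w, b, x) >= 0


def margin_width(w):
    """Width 2 / ||w|| of the margin for a canonical SVM solution."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))
```

A wide margin corresponds to a small ||w||, which is why margin maximization is written as minimizing the norm of w.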
Other Priors?
• Current prior uses the simplest features –
DCT. These features are not robust to
deformations.
• State-of-the-art features – SIFT: local image features that are invariant to translation, rotation, and scale. They are also robust to minor variations in illumination and viewpoint.
SIFT – Scale Invariant Feature Transform
• Descriptor overview:
– Determine scale, local orientation as the dominant gradient
direction. Use this scale and orientation to make all further
computations invariant to scale and rotation.
– Compute gradient orientation histograms of several small
windows (128 values for each point)
– Normalize the descriptor to make it invariant to intensity
change
David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
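The last two descriptor steps can be illustrated with a heavily simplified sketch. It assumes a 16×16 patch already aligned to the keypoint scale and orientation, splits it into 4×4 cells with 8 orientation bins each (128 values), and normalizes the result to unit length. It omits Gaussian weighting, interpolation between bins, and the clipping step, so it is not Lowe's full descriptor.

```python
import math


def orientation_descriptor(patch, cells=4, bins=8):
    """Gradient-orientation histograms over a grid of cells, unit-normalized.

    patch: square list of rows of floats whose side is divisible by `cells`.
    """
    n = len(patch)
    cell = n // cells
    hist = [0.0] * (cells * cells * bins)
    for y in range(1, n - 1):
        for x in range(1, n - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            ang = math.atan2(gy, gx) % (2 * math.pi)
            b = int(ang / (2 * math.pi) * bins) % bins
            c = (y // cell) * cells + (x // cell)
            hist[c * bins + b] += mag  # vote weighted by gradient magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


ramp = [[float(x) for x in range(16)] for _ in range(16)]
brighter = [[2.0 * v + 10.0 for v in row] for row in ramp]
```

Because the descriptor is built from gradients and then normalized, `ramp` and `brighter` (a gain and offset change) produce the same 128 values up to rounding.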
PROJECT 3
SIFT statistics:
The goal of the project is to learn the statistics of state-of-the-art SIFT features in order to design a prior for recognition of images represented by SIFT descriptors.
Patch-Based Face Representation
• Patch-based representation of a human face has several advantages:
– It can be used in privacy-preserving applications where the identity of the person, specifically their photo, is classified.
– It can be used in face identification with
occlusions, such as glasses, facial hair, etc.
– Since local patches can be assumed planar, it
can also remove the effect of illumination
change.
Patch-Based Face Representation
• A face is represented by a collection of
informative patches:
Patch centers
Patch size
–could vary
• Assume that the face is represented by N
patches.
Gallery
Public database of faces – M faces
[Figure: face patches 1, 2, …, N shown against the gallery]
Indexing
V =
Patch | Index
1 | 14
2 | 8
… | …
N | 5
The resulting vector V can be used for face recognition, but the picture of the person is not saved, so it cannot be misused.
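One way to picture the indexing step is sketched below. The slides do not spell out the matching rule, so this assumes, purely for illustration, that patch i of the new face is compared to patch i of every gallery face by squared Euclidean distance and that V stores the index of the closest gallery face. The names `index_face` and `gallery` are hypothetical.

```python
def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def index_face(face_patches, gallery):
    """Build the index vector V for one face.

    face_patches[i] is patch i of the new face (a flat list of floats).
    gallery[m][i] is patch i of gallery face m.
    V[i] is the gallery face whose patch i is closest (assumed rule).
    """
    V = []
    for i, patch in enumerate(face_patches):
        best = min(range(len(gallery)),
                   key=lambda m: sq_dist(patch, gallery[m][i]))
        V.append(best)
    return V


# Toy gallery: M = 3 faces, N = 2 patches, each patch has 2 values.
gallery = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[5.0, 5.0], [0.0, 0.0]],
    [[9.0, 9.0], [4.0, 4.0]],
]
face = [[5.0, 4.0], [4.0, 5.0]]
```

Only the integer vector returned by `index_face` needs to be stored, which is the privacy property the slide points to.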
Recognition
Enrolled people: V1, V2, …, Vk
V is matched to each Vi (i = 1..k) using Hamming distance.
Can be made more robust – see project description
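Matching by Hamming distance can be sketched in a few lines. The enrolled names and index vectors below are made-up toy data, and ties are broken by whichever entry comes first.

```python
def hamming(u, v):
    """Number of positions where two equal-length index vectors differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def identify(V, enrolled):
    """Return (name, distance) of the enrolled vector closest to V.

    enrolled maps a person's name to their index vector Vi.
    """
    return min(((name, hamming(V, Vi)) for name, Vi in enrolled.items()),
               key=lambda t: t[1])


enrolled = {
    "person_1": [14, 8, 3, 5],
    "person_2": [2, 8, 7, 1],
}
query = [14, 8, 9, 5]
```

With this data, `identify(query, enrolled)` picks `person_1` at distance 1, since only the third patch index differs.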
PROJECT 4
“Clusteron”
This project investigates a new patch-based representation of a human face and applies it to face identification.
Lighting changes an object's appearance.
How do we recognize these objects?
Lambertian
Specular
Few Definitions: Reflection
• Reflection - The
scattering of light from
an object.
• Two extreme cases:
diffuse reflection and
specular reflection.
• Real objects reflect
light as a mixture of
these two extremes.
Few Definitions: Lambertian Reflection
• Surface reflects equally in
all directions.
– Examples: chalk, clay, cloth,
matte paint
• Brightness doesn’t
depend on viewpoint.
• Amount of light striking
surface proportional to
cos θ.
$$I_D = K_D \max(N \cdot L,\, 0)$$
where $I_D$ is the intensity, $K_D$ is the albedo, $N$ is the surface normal, and $L$ = (light intensity) × (light direction).
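The Lambertian model is easy to evaluate directly. The sketch assumes a unit surface normal and a light vector that already includes the light intensity, as in the definition above; the function name is illustrative.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def lambertian_intensity(albedo, normal, light):
    """albedo * max(N.L, 0), with L = light intensity * light direction.

    The max() keeps surfaces facing away from the light at zero
    instead of giving them negative brightness.
    """
    return albedo * max(dot(normal, light), 0.0)


normal = [0.0, 0.0, 1.0]
overhead = [0.0, 0.0, 2.0]                        # intensity 2, straight on
oblique = [math.sin(math.pi / 3), 0.0, math.cos(math.pi / 3)]  # 60 degrees
behind = [0.0, 0.0, -1.0]
```

The viewing direction does not appear anywhere in the function, which is the sense in which Lambertian brightness does not depend on viewpoint.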
Few Definitions: Specular Reflection
• Specular surfaces reflect
light more strongly in
some directions than in
others.
• Appearance of a surface
depends on the direction
L of the light source,
direction of the surface
normal N, and direction V
of viewing.
The vectors L, N and R all lie in one plane
Few Definitions: Specular Reflection
[Figure: mirror and rough specular reflection, showing light direction L, surface normal N, reflection direction R, and angle θ]
• Perfect mirror: The angle of incidence equals the angle of reflection.
• Rough specular: Most specular surfaces reflect energy in a tight distribution (or lobe) centered on the optical reflection direction
– Examples: metals, glass
Few Definitions: Phong Model
[Figure: vectors L, N, R, and V, with angle α between R and V]
$$I_s = K_s L \cos^n \alpha$$
• Determine the angle α
between the direction V of
viewing and the direction
R of reflection by an ideal
mirror.
• Assume the intensity of reflected light is proportional to cos^n(α)
• The exponent n (“shine”)
is determined empirically.
• Large values of n make
the surface behave more
like an ideal mirror.
• Phong’s exponent controls how fast the highlight
“falls-off”
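The Phong specular term can be computed from the three directions. The sketch below assumes the standard mirror formula R = 2(N·L)N − L for unit vectors and clamps cos α at zero so that views away from the lobe get no highlight; the clamp and the parameter names are assumptions of this example.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [p / n for p in v]


def reflect(light_dir, normal):
    """Ideal mirror direction R = 2 (N.L) N - L for unit L and N."""
    l, n = normalize(light_dir), normalize(normal)
    k = 2.0 * dot(n, l)
    return [k * ni - li for ni, li in zip(n, l)]


def phong_specular(k_s, light_intensity, light_dir, normal, view_dir, n):
    """I_s = K_s * L * cos(alpha)^n, alpha = angle between R and V."""
    r = reflect(light_dir, normal)
    cos_alpha = max(dot(r, normalize(view_dir)), 0.0)
    return k_s * light_intensity * cos_alpha ** n


up = [0.0, 0.0, 1.0]
off_axis = [math.sin(math.pi / 3), 0.0, math.cos(math.pi / 3)]  # 60 degrees from R
```

Viewing 60 degrees off the mirror direction, n = 1 keeps half of the peak intensity while n = 10 keeps about a thousandth, which is the fall-off the exponent controls.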
Main Approaches
Lambertian
2D methods based on quasi-invariance to lighting
Model-based: 3D to 2D
Low-dimensional representation of an object's image set under different lightings
[Figure: a 3D model is rendered and the rendering is compared with the image]
Main Approaches
Specular
?
Apply Lambertian methods and treat
specularities as noise
2D Methods: will be
distracted by
highlights and lack of
real edges.
3D Methods: Specular objects cannot be well approximated by low-dimensional linear subspaces.
Use specularities for recognition
Mapping
[Figure: image points mapped onto the Gaussian sphere; vectors L, N, R, and V]
Finding Specularity
[Figure labels: query, threshold, specular candidates, map onto the sphere, specularity disk, map back, recovered highlights; outcome: consistent]
Wrong Match
[Figure labels: query, threshold, specular candidates, map onto the sphere, specularity disk, map back, recovered highlights; outcome: inconsistent]
PROJECT 5
“Specularity detection”
Assume that there are two types of points on a 3D sphere.
A plane intersects a sphere in a disk.
1) Find a plane that separates the points into two regions, a disk and the rest of the sphere, with the minimal number of misclassifications (classification algorithm).
2) Test it on specularities obtained from images of real objects using mapping via 3D normals (scan models using a 3D scanner and take their pictures under different lighting directions).
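As a starting point for step 1, the error of a candidate plane is simple to count. The sketch below represents a plane by a normal n and an offset t, treats points with n·p ≥ t as inside the disk, and runs a brute-force search that tries each data point as a candidate normal. That search is only a baseline assumed for illustration, not the classification algorithm the project asks for.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def misclassified(n, t, points, labels):
    """Count points whose side of the plane n.p = t disagrees with the label.

    labels[i] is True when point i should fall inside the disk (n.p >= t).
    """
    return sum((dot(n, p) >= t) != lab for p, lab in zip(points, labels))


def best_cap(points, labels):
    """Brute-force baseline: returns (errors, normal, offset)."""
    best = None
    for n in points:  # heuristic: use the data points as candidate normals
        proj = sorted(dot(n, p) for p in points)
        thresholds = [(a + b) / 2 for a, b in zip(proj, proj[1:])]
        thresholds += [proj[0] - 1.0, proj[-1] + 1.0]
        for t in thresholds:
            err = misclassified(n, t, points, labels)
            if best is None or err < best[0]:
                best = (err, n, t)
    return best


# Toy data on the unit sphere: a cluster near the north pole is "specular".
points = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8], [-0.6, 0.0, 0.8],
          [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0], [-1.0, 0.0, 0.0]]
labels = [True, True, True, True, False, False, False, False]
```

On this toy set the search finds a plane with zero errors; on real specularity candidates the minimum would generally be nonzero, and that count is what the project proposes to minimize.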