Face Recognition from Face Motion Manifolds using Robust
An Illumination Invariant Face
Recognition System for Access Control
using Video
Ognjen Arandjelović
Roberto Cipolla
Funded by Toshiba Corp. and Trinity College, Cambridge
Face Recognition
•
Single-shot recognition – a popular
area of research since the 1970s
•
Many methods have been developed
•
Poor performance in the presence of:
– Illumination variation
– Pose variation
– Facial expression
– Occlusions (glasses, hair etc.)
Eigenfaces
3D Morphable
Models
Wavelet methods
Face Recognition from Video
•
Face motion helps resolve ambiguities of single shot recognition – implicit 3D
•
Video information often available (surveillance, authentication etc.)
Recognition setup
Training stream
Novel stream
Face Manifolds
•
Face patterns describe manifolds which are:
– Highly nonlinear, and
– Noisy, but
– Smooth
Facial features
Face pattern manifold
Face region
Limitations of Previous Work
•
In this work we address 3 fundamental questions:
– How to model nonlinear manifolds of face motion
– How to achieve illumination and pose robustness
– How to choose the distance measure
Face Motion Manifolds: Revisited
Unchanging identity, changing illumination
Changing identity, unchanging illumination
•
Motivation: How can we use the prior knowledge on the shape of the
manifolds?
Pose Clusters
•
Face motion manifolds are nonlinear, but:
– Low-dimensional (cf. registration, which reduces the dimensionality), and
– Key observation: can be described well using only 3 linear pose clusters
Colour-coded pose clusters for 3 manifolds
Determining Pose Clusters
•
Pose clusters are semantic clusters:
– K-means and similar algorithms are unsuitable
– We use a simple method based on motion parallax
– Membership is decided by maximum likelihood
Yaw measure: $\eta = \frac{0.5\,(x_{reye} + x_{leye} - x_{rnostril} - x_{lnostril})}{x_{reye} - x_{leye}}$
Pupils
Distribution for 3 clusters
Discrepancy η
Image plane
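The pose assignment described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the yaw measure is the horizontal offset between the pupil midpoint and the nostril midpoint, normalised by the inter-pupil distance, and that each pose cluster is modelled by a 1D Gaussian over that measure. The names `yaw_measure`, `assign_pose` and `POSE_MODELS`, the cluster labels, and all numeric parameters are hypothetical placeholders.

```python
import math


def yaw_measure(x_reye, x_leye, x_rnostril, x_lnostril):
    """Offset between pupil and nostril midpoints, normalised by pupil distance.

    Assumed form of the yaw measure; the sign convention is arbitrary.
    """
    eye_mid = 0.5 * (x_reye + x_leye)
    nostril_mid = 0.5 * (x_rnostril + x_lnostril)
    return (eye_mid - nostril_mid) / (x_reye - x_leye)


# Hypothetical (mean, std) of the yaw measure for each of the 3 pose clusters.
# In practice these would be estimated from labelled training frames.
POSE_MODELS = {
    "left": (-0.15, 0.05),
    "frontal": (0.0, 0.05),
    "right": (0.15, 0.05),
}


def log_gaussian(x, mean, std):
    """Log-density of a 1D Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def assign_pose(eta, models=POSE_MODELS):
    """Maximum-likelihood pose cluster for a yaw measure eta."""
    return max(models, key=lambda name: log_gaussian(eta, *models[name]))


if __name__ == "__main__":
    eta = yaw_measure(x_reye=60, x_leye=40, x_rnostril=58, x_lnostril=48)
    print(eta, assign_pose(eta))
```

Because the measure uses only four horizontal feature coordinates, a frame can be assigned to a pose cluster before any appearance model is built, which is what makes the clusters semantic rather than data-driven.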
Pose Clusters: Example
Input manifold and colour-coded
pose clusters
Sample frames from
the 3 pose clusters
Illumination compensation
•
Performed in two stages:
– Coarse illumination compensation (exploiting face smoothness)
– Fine illumination compensation (exploiting low dimensionality of the face
illumination subspace)
Illumination Subspace
Input
Output
RGIC
Optimization
Reference Cluster
Region-based GIC
Gamma Intensity Correction (GIC)
$\gamma^* = \arg\min_{\gamma} \sum_{x,y} \left[ I(x,y)^{\gamma} - I_C(x,y) \right]^2$
$I^* = I(x,y)^{\gamma^*}$
Solved by 1D non-linear optimization
Canonical image
Face regions
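A minimal sketch of the 1D optimisation behind Gamma Intensity Correction, assuming the cost is the sum of squared differences between the gamma-corrected input and the canonical image, with pixel intensities scaled to [0, 1]. The golden-section search, the search interval and the function names are illustrative choices; the slides do not say which 1D optimiser was used.

```python
import math


def gic_cost(gamma, image, canonical):
    """Sum of squared differences between image**gamma and the canonical image."""
    return sum(
        (p ** gamma - c) ** 2
        for row, crow in zip(image, canonical)
        for p, c in zip(row, crow)
    )


def fit_gamma(image, canonical, lo=0.1, hi=5.0, tol=1e-6):
    """Golden-section search for gamma, assuming the cost is unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if gic_cost(c, image, canonical) < gic_cost(d, image, canonical):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


def gamma_correct(image, gamma):
    """Apply the fitted gamma to every pixel."""
    return [[p ** gamma for p in row] for row in image]


if __name__ == "__main__":
    canonical = [[0.2, 0.5], [0.7, 0.9]]
    # Synthetic input: the canonical image viewed under a gamma of 0.5.
    image = [[c ** 0.5 for c in row] for row in canonical]
    gamma = fit_gamma(image, canonical)
    print(gamma, gamma_correct(image, gamma))
```

In the synthetic usage example the fitted gamma undoes the simulated distortion, so the corrected image approximately reproduces the canonical one.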
•
Region-based GIC (RGIC): faces are (roughly) divided into
regions with smoothly varying surface normal
Varying gamma (regions 1, 2, 3, 4)
Region-based GIC: Artefacts
•
Region-based GIC suffers from artefacts at region boundaries
Mean face
γ value map
Smoothed γ map
Input face
RGIC face
Our method
Boundary artefacts
Artefacts
removed
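One way to picture the artefact removal is to build a per-pixel gamma map from the per-region values and smooth it before applying the correction. The sketch below assumes a simple mean (box) filter and a hypothetical label map giving each pixel's region; the slides do not specify the smoothing kernel, so this is an illustration rather than the method actually used.

```python
def region_gamma_map(labels, region_gammas):
    """Piecewise-constant gamma map: each pixel takes its region's gamma."""
    return [[region_gammas[label] for label in row] for row in labels]


def box_smooth(gamma_map, radius=1):
    """Mean filter; the window is clipped at the image borders."""
    h, w = len(gamma_map), len(gamma_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                gamma_map[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = sum(window) / len(window)
    return out


def apply_gamma_map(image, gamma_map):
    """Per-pixel gamma correction for intensities in [0, 1]."""
    return [[p ** g for p, g in zip(row, grow)] for row, grow in zip(image, gamma_map)]


if __name__ == "__main__":
    labels = [[0, 0, 1, 1]]            # two regions side by side
    gammas = {0: 1.0, 1: 2.0}          # hypothetical per-region gamma values
    raw = region_gamma_map(labels, gammas)
    smooth = box_smooth(raw, radius=1)
    print(raw, smooth)
    print(apply_gamma_map([[0.5, 0.5, 0.5, 0.5]], smooth))
```

With the raw map, neighbouring pixels on either side of a region boundary receive different gammas and the corrected intensity jumps; after smoothing, the gamma changes gradually across the boundary.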
Illumination Subspace
•
Each input frame is corrected by a linear Pose Illumination Subspace
component so that it matches the reference distribution of the same pose
– Illumination subspace is high-dimensional
– Constrained to expected variations by Mahalanobis distance
Input manifold
Illumination Subspace
$I^* = I - B_I a^*$
Subject to:
$a^* = \arg\min_{a} \lVert I - B_I a \rVert_M$
where $\lVert \cdot \rVert_M$ is the Mahalanobis distance in the reference Gaussian
Reference manifold
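Under some additional assumptions the minimisation has a closed form. The derivation below is a sketch, not taken from the slides: it assumes the reference Gaussian has mean $\mu$ and covariance $\Sigma$ (symbols introduced here), that the Mahalanobis distance is measured from the corrected frame to $\mu$, that the illumination basis $B_I$ has full column rank, and that the correction is subtracted (the opposite sign convention only flips the sign of $a^*$).

```latex
\begin{aligned}
a^* &= \arg\min_{a}\; (I - B_I a - \mu)^{\top} \Sigma^{-1} (I - B_I a - \mu) \\
0   &= -2\, B_I^{\top} \Sigma^{-1} (I - B_I a^* - \mu)
      \quad \text{(gradient with respect to } a \text{ set to zero)} \\
a^* &= \left( B_I^{\top} \Sigma^{-1} B_I \right)^{-1} B_I^{\top} \Sigma^{-1} (I - \mu) \\
I^* &= I - B_I a^*
\end{aligned}
```

This is a weighted least-squares projection: the part of $I - \mu$ that the illumination subspace can explain is removed, with directions weighted by how tightly the reference distribution constrains them.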
Illumination Compensation Results
Strong side lighting
Original/input frames
Illumination-corrected
frames
Reference frames
And in face pattern space…
Comparing Pose Clusters
Reference cluster
Reduced spread
Novel cluster
Cluster
centres
•
“Distribution-based” distances (Kullback-Leibler divergence, Resistor Average
Distance etc.) unsuitable
•
We use the simple Euclidean distance between cluster centres
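The distance between two pose clusters then reduces to a few lines. A minimal sketch, assuming each cluster is a list of equally sized feature vectors (for example, rasterised face images after illumination compensation); the function names are illustrative.

```python
import math


def cluster_centre(frames):
    """Component-wise mean of a list of equally sized feature vectors."""
    n = len(frames)
    return [sum(component) / n for component in zip(*frames)]


def centre_distance(cluster_a, cluster_b):
    """Euclidean distance between the centres of two pose clusters."""
    centre_a = cluster_centre(cluster_a)
    centre_b = cluster_centre(cluster_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(centre_a, centre_b)))


if __name__ == "__main__":
    reference = [[0.0, 0.0], [2.0, 0.0]]   # centre (1, 0)
    novel = [[1.0, 3.0], [1.0, 5.0]]       # centre (1, 4)
    print(centre_distance(reference, novel))
```

Unlike distribution-based distances, this measure ignores the spread of each cluster, so it does not require reliable covariance estimates from short video sequences.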
Unified Manifold Similarity
•
Recognition is based on the likelihood ratio:
$\frac{P(D_{1,2,3} \mid s)}{P(D_{1,2,3} \mid \bar{s})}$
where $s$ is the event that the two manifolds belong to the same person and
$D_{1,2,3}$ are the distances between corresponding pose clusters
•
Learn likelihoods from ground truth training data
Likelihood
histogram
RBF-interpolated
likelihood
Undefined value regions
Two-pose interpolated
likelihood
Likelihood now
monotonically decreasing
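A simplified sketch of learning the likelihoods from labelled distances and forming the ratio. Two stand-ins are used here and are not what the slides describe: a pseudo-count replaces the RBF interpolation as a way of avoiding undefined (empty) histogram regions, and the three pose-cluster distances are treated as independent so that the joint likelihood is a product of 1D terms. The function names and bin edges are hypothetical.

```python
def histogram_likelihood(samples, bin_edges, pseudo_count=1.0):
    """Return a function d -> smoothed probability of the bin containing d."""
    n_bins = len(bin_edges) - 1
    counts = [pseudo_count] * n_bins   # pseudo-count keeps empty bins non-zero

    def bin_index(d):
        for k in range(n_bins):
            if bin_edges[k] <= d < bin_edges[k + 1]:
                return k
        return None

    for d in samples:
        k = bin_index(d)
        if k is not None:
            counts[k] += 1
    total = sum(counts)

    def likelihood(d):
        k = bin_index(d)
        # Distances outside the histogram range get the pseudo-count mass.
        return (counts[k] if k is not None else pseudo_count) / total

    return likelihood


def likelihood_ratio(distances, same_models, diff_models):
    """Product of per-pose ratios (assumes the distances are independent)."""
    ratio = 1.0
    for d, p_same, p_diff in zip(distances, same_models, diff_models):
        ratio *= p_same(d) / p_diff(d)
    return ratio


if __name__ == "__main__":
    edges = [0.0, 1.0, 2.0, 3.0]
    p_same = histogram_likelihood([0.2, 0.5, 0.7, 1.5], edges)   # same person
    p_diff = histogram_likelihood([1.5, 2.2, 2.5, 2.9], edges)   # different people
    print(likelihood_ratio([0.5, 0.5, 0.5], [p_same] * 3, [p_diff] * 3))
```

A ratio above 1 favours the hypothesis that the two manifolds belong to the same person; the decision threshold on the ratio sets the trade-off between false acceptances and false rejections.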
Face Video Database Revisited
•
Testing performed under extreme, varying illuminations
10 illumination conditions used (random 5 for training, others for testing)
Registration
•
Linear operations on images are highly nonlinear in the pattern space
•
Translation/rotation and weak perspective can be easily corrected for directly
from point correspondences
– We use the locations of pupils and nostrils to robustly estimate the optimal
affine registration parameters
Translation
manifold
Skew
manifold
Rotation
manifold
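The affine registration from point correspondences can be sketched as an ordinary least-squares fit. This is a plain (non-robust) illustration in pure Python: it assumes at least three non-collinear feature points, such as the two pupils and two nostrils, and hypothetical canonical target positions. The robust estimation mentioned on the slide is not reproduced here.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_affine(src_points, dst_points):
    """Least-squares affine map; returns rows (a, b, tx) and (c, d, ty)."""
    design = [[x, y, 1.0] for x, y in src_points]
    ata = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    params = []
    for k in range(2):  # fit the output x and y coordinates separately
        atb = [sum(row[i] * dst[k] for row, dst in zip(design, dst_points))
               for i in range(3)]
        params.append(solve_linear(ata, atb))
    return params


def apply_affine(params, point):
    """Map a point (x, y) through the fitted affine transform."""
    x, y = point
    return tuple(a * x + b * y + t for a, b, t in params)


if __name__ == "__main__":
    # Detected pupils and nostrils (hypothetical pixel coordinates).
    detected = [(40.0, 50.0), (60.0, 50.0), (46.0, 70.0), (54.0, 70.0)]
    # Hypothetical canonical positions the features should map to.
    canonical = [(30.0, 30.0), (70.0, 30.0), (42.0, 70.0), (58.0, 70.0)]
    transform = fit_affine(detected, canonical)
    print([apply_affine(transform, p) for p in detected])
```

The fitted transform would then be used to resample the face image so that every frame has its features in the same place before pattern-space comparison.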
Registration Method Used
•
Feature localization based on the combination of shape and pattern matching
(Fukui et al. 1998)
Detect
features
Crop & affine
register faces
Results
•
Very high recognition rates attained (96% average) under extreme variations
in illumination
•
Other methods showed little to no illumination invariance
Results, continued
•
The method was shown to give promising results for authentication uses:
– Good separability of inter- and intra- class manifold distances was found
– It can provide a secure system with only a 0.1% false positive rate and an 8%
false negative rate
Cumulative distributions
of inter- and intra- class
manifold distances
The ROC curve for
the proposed method
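The false positive and false negative rates quoted above depend on where the acceptance threshold is placed. A minimal sketch of that computation, assuming an identity claim is accepted when the manifold distance falls below a threshold; the distances in the usage example are made up for illustration and are not the reported data.

```python
def error_rates(intra_distances, inter_distances, threshold):
    """Return (false_positive_rate, false_negative_rate) at a threshold.

    intra_distances: distances between manifolds of the same person.
    inter_distances: distances between manifolds of different people.
    """
    false_neg = sum(d >= threshold for d in intra_distances) / len(intra_distances)
    false_pos = sum(d < threshold for d in inter_distances) / len(inter_distances)
    return false_pos, false_neg


def roc_points(intra_distances, inter_distances):
    """Sweep the threshold over all observed distances."""
    thresholds = sorted(set(intra_distances) | set(inter_distances))
    return [(t,) + error_rates(intra_distances, inter_distances, t)
            for t in thresholds]


if __name__ == "__main__":
    intra = [0.1, 0.2, 0.3, 0.9]    # illustrative values only
    inter = [0.8, 1.0, 1.2, 1.5]
    print(error_rates(intra, inter, threshold=0.5))
    for point in roc_points(intra, inter):
        print(point)
```

Lowering the threshold trades false positives for false negatives, which is why an access-control setting can reach a very low false positive rate at the cost of a higher false negative rate.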
Future Research
•
Non-constant illumination within a single sequence causes problems
•
Illumination compensation is still not perfect – pose illumination subspaces have
unnecessarily high dimensions
•
Pose estimation is too primitive – outliers cause problems in estimation of linear
subspaces
•
Complete pose invariance is still not achieved (what if there are no
corresponding pose clusters?)
For suggestions, questions etc. please contact me at: [email protected]