Using spatio-temporal probabilistic framework for object tracking


Using spatio-temporal probabilistic framework for object tracking
Emphasis on Face Detection & Tracking
By: Guy Koren-Blumstein
Supervisor: Dr. Hayit Greenspan
Agenda
► Previous research overview (PGMM)
► Under-segmentation problem
► Face tracking using PGMM
 Modeling skin color in [L,a,b] color space – over-segmentation problem
► Optical flow – overview
► Approaches for using optical flow
► Examples
Previous research
► Complementary research to an M.Sc. thesis conducted by A. Mayer under the supervision of Dr. H. Greenspan and Dr. J. Goldberger.
► Research Goal: Building a probabilistic framework for spatio-temporal video representation.
► Useful for:
 Offline – automatic search in video databases
 Online – characterization of events and alerting on those that are defined as ‘suspicious’
Previous research
[Pipeline diagram] Source clip → parse the clip into BOFs → build feature space [L a b] → build GMM model in [L,a,b] space (under-segmentation problem…) → label BOF pixels → connected components / learn GMM model on [L,a,b,x,y,t] → blob extraction → labeled BOF. (A minimal sketch of the [L,a,b,x,y,t] labeling step follows below.)
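Below is a minimal sketch (not the author's implementation) of the GMM labeling step, assuming one BOF is given as a list of RGB frames and that scikit-image / scikit-learn are available; the function name label_bof and the blob count n_blobs are illustrative choices.

# Sketch: learn a GMM over [L,a,b,x,y,t] pixel features of one BOF and
# give every pixel a hard label (blob index) from its most likely Gaussian.
import numpy as np
from skimage.color import rgb2lab
from sklearn.mixture import GaussianMixture

def label_bof(frames_rgb, n_blobs=5):            # n_blobs is an assumed model order
    feats = []
    for t, frame in enumerate(frames_rgb):       # frames of one BOF, all the same size
        lab = rgb2lab(frame)                     # per-pixel L, a, b
        h, w, _ = lab.shape
        ys, xs = np.mgrid[0:h, 0:w]
        f = np.dstack([lab, xs, ys, np.full((h, w), float(t))])
        feats.append(f.reshape(-1, 6))           # rows of [L, a, b, x, y, t]
    feats = np.vstack(feats)
    gmm = GaussianMixture(n_components=n_blobs, covariance_type='full')
    labels = gmm.fit_predict(feats)              # hard pixel labels
    return labels.reshape(len(frames_rgb), h, w), gmm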
Face Detection & Tracking
► Most of the known techniques can be divided into two categories:
 Search for skin color and apply shape analysis to distinguish between facial and non-facial objects.
 Search for facial features regardless of pixel color (eyes, nose, mouth, chin, symmetry, etc.)
Apply framework to track faces
► The framework can extract and track objects in an image sequence.
► Applying shape analysis to each skin-colored blob can label the blob as ‘face’ or ‘non-face’.
► The face will then be tracked by virtue of the tracking capabilities of the framework.
Skin color in [L a b]
► Skin color is modeled in the [a b] components only
► Supplies very good discriminability between ‘skin’ pixels and ‘not-skin’ pixels (high True-Negative rate)
► Not optimal in terms of True-Positive rate (leads to missed detections of skin-color pixels); see the classification sketch below
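A minimal sketch of such a skin classifier, assuming the skin model is a single Gaussian over the [a,b] components; skin_mean, the covariance values and the threshold are placeholders that would come from training data, not from the talk.

# Sketch: classify pixels as skin / not-skin from their [a, b] components
# using the Mahalanobis distance to a Gaussian skin-color model.
import numpy as np
from skimage.color import rgb2lab

skin_mean = np.array([15.0, 20.0])                       # placeholder [a, b] mean
skin_cov_inv = np.linalg.inv(np.array([[30.0, 5.0],
                                       [5.0, 25.0]]))    # placeholder covariance

def skin_mask(frame_rgb, thresh=3.0):
    ab = rgb2lab(frame_rgb)[..., 1:]                     # keep only a, b (drop L)
    d = ab - skin_mean
    maha = np.einsum('...i,ij,...j->...', d, skin_cov_inv, d)
    return maha < thresh ** 2                            # True where the pixel looks like skin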
Over-segmentation of faces
► Building blobs is done in [L a b] color space.
► More than one blob might have skin-color [a b] components.
► Solution: unite all blobs whose [a b] components are close enough to the skin-color model (an adaptive threshold can be used); see the merging sketch below.
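A small sketch of the blob-union step under the same assumed Gaussian skin model; blob_means_ab and the fixed thresh stand in for the adaptive threshold mentioned above.

# Sketch: unite all blobs whose mean [a, b] is close enough to the skin model.
import numpy as np

def unite_skin_blobs(blob_means_ab, skin_mean, skin_cov_inv, thresh=3.0):
    """blob_means_ab: (n_blobs, 2) array of mean [a, b] per blob.
    Returns the indices of blobs to merge into one face-candidate region."""
    d = blob_means_ab - skin_mean
    maha = np.einsum('ni,ij,nj->n', d, skin_cov_inv, d)
    return np.where(maha < thresh ** 2)[0]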
Under Segmentation
► Faces moving in front of a skin-colored background are not extracted well.
► Applying shape analysis on the middle map yields mis-detection of faces.
Employing motion information
► Motion information helps to distinguish between foreground dynamic objects and the static background
► Two levels of motion information:
 Binary – indicates for each pixel whether it is in motion or not; does not supply a motion vector.
Feature space: [L a b x y t m] where m = {0,1}
 Optical flow – supplies a motion vector according to a given model.
Feature space: [L a b x y t Vx Vy] (see the feature-vector sketch after this slide)
Is binary information good enough?
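For concreteness, a tiny sketch of the two per-pixel feature vectors, assuming the Lab values, coordinates, binary mask m, and flow field (Vx, Vy) are already computed; the helper names are illustrative only.

# Sketch: build the two feature spaces compared above.
import numpy as np

def features_binary(lab, xs, ys, t, motion_mask):
    # [L, a, b, x, y, t, m] with m in {0, 1}
    return np.dstack([lab, xs, ys, np.full(xs.shape, float(t)),
                      motion_mask.astype(float)])

def features_flow(lab, xs, ys, t, vx, vy):
    # [L, a, b, x, y, t, Vx, Vy]
    return np.dstack([lab, xs, ys, np.full(xs.shape, float(t)), vx, vy])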
Optical Flow
► Optical flow is the apparent motion of image brightness
► If I(x,y,t) is the brightness, two main assumptions can be made:
 I(x,y,t) depends on the coordinates x,y in the greater part of the image
 The brightness of every point of a moving object does not change in time
Optical Flow
► If the object moves during time dt and its displacement is (dx, dy), then using a Taylor series:

I(x + dx, y + dy, t + dt) = I(x, y, t) + \frac{\partial I}{\partial x} dx + \frac{\partial I}{\partial y} dy + \frac{\partial I}{\partial t} dt + \ldots

► According to assumption 2:

I(x + dx, y + dy, t + dt) = I(x, y, t)

\frac{\partial I}{\partial x} dx + \frac{\partial I}{\partial y} dy + \frac{\partial I}{\partial t} dt + \ldots = 0

► Dividing by dt gives the optical flow equation:

\frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v = -\frac{\partial I}{\partial t}, \qquad u = \frac{dx}{dt}, \; v = \frac{dy}{dt}
Optical Flow – Block Matching
► Does not use the optical flow equation directly.
► Divides the image into blocks.
► For every block in I_t it searches for the best-matching block in I_{t-1}.
► Matching criteria: cross-correlation, square difference, SAD, etc. (see the SAD sketch below)
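A minimal exhaustive block-matching sketch with the SAD criterion, assuming grayscale frames; the block size and search range are illustrative parameters, not values from the talk.

# Sketch: estimate one motion vector per block by exhaustive SAD block matching.
import numpy as np

def block_matching(prev, curr, block=16, search=8):
    h, w = curr.shape
    flow = np.zeros((h // block, w // block, 2))          # (dy, dx) per block
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            cur_blk = curr[by:by + block, bx:bx + block].astype(np.int32)
            best, best_dv = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y0, x0 = by + dy, bx + dx
                    if y0 < 0 or x0 < 0 or y0 + block > h or x0 + block > w:
                        continue                           # candidate falls outside the frame
                    ref_blk = prev[y0:y0 + block, x0:x0 + block].astype(np.int32)
                    sad = np.abs(cur_blk - ref_blk).sum()  # sum of absolute differences
                    if sad < best:
                        best, best_dv = sad, (dy, dx)
            flow[by // block, bx // block] = best_dv
    return flow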
Working with 8-D feature space
► Connected component analysis:
 Does not require initialization of the order of the model
 Hard-decision prone
► GMM model via EM:
 Initialized by K-means; requires initialization of K
 Imposes an elliptic shape on the objects
 Soft-decision prone
(a small sketch contrasting the two follows below)
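A small sketch contrasting the two options, assuming a binary motion mask for the connected-component branch and per-pixel [x,y,t,Vx,Vy] rows for the GMM branch; K and the helper names are assumptions.

# Sketch: the two grouping strategies discussed above.
import numpy as np
from scipy.ndimage import label                  # connected components
from sklearn.mixture import GaussianMixture      # GMM via EM (K-means init)

def group_connected(motion_mask):
    # Hard decision, no model order needed: connected regions of moving pixels.
    labels, n_regions = label(motion_mask)
    return labels, n_regions

def group_gmm(moving_feats, K=3):
    # Soft decision, elliptic clusters; K must be chosen up front.
    gmm = GaussianMixture(n_components=K, init_params='kmeans')
    resp = gmm.fit(moving_feats).predict_proba(moving_feats)   # soft memberships
    return resp, gmm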
[Modified pipeline diagram] Parse the clip into BOFs → build feature space [L a b] → build GMM model in [L,a,b] space → label BOF pixels → then one of: connected components on [x,y,t,Vx,Vy], learning a GMM model on [x,y,t,Vx,Vy], or frame-by-frame tracking.
Frame by frame tracking
► Widely used in the literature
► Can handle variations in an object’s velocity
► Tracking can be improved by employing a Kalman filter to predict the object’s location and velocity (see the sketch after the loop diagram below)
[Tracking-loop diagram] Label by predicted parameters → update blob parameters → label by updated parameters → kill old blobs / split blobs / create new blobs / merge blobs → predict parameters for the next frame.
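A minimal constant-velocity Kalman filter sketch for a single blob, assuming the state is (x, y, vx, vy) and the measurement is the blob centroid; the noise matrices are placeholder values, not values from the talk.

# Sketch: predict/update steps of a constant-velocity Kalman filter per blob.
import numpy as np

F = np.array([[1, 0, 1, 0],      # state transition: x += vx, y += vy
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],      # only the centroid (x, y) is measured
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 0.01             # placeholder process noise
R = np.eye(2) * 1.0              # placeholder measurement noise

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):             # z: measured blob centroid [x, y]
    y = z - H @ x                             # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

Per frame this matches the loop above: blobs are first labeled with the predicted state, the blob parameters measured in the frame drive update(), and predict() supplies the parameters for the next frame.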
Examples
► Opposite directions
 Optical Flow, Connected component (Extracted Faces), GMM
► Same direction, different velocity
 Optical Flow, Connected component, GMM (Faces)
► Different directions – complex background
 Optical Flow, Connected component, GMM: K=5, K=3, Faces
► Variable velocity
 Optical Flow, Connected component, GMM, Frame By Frame
Real World Sequences
► Face tracking
 Optical Flow
 No motion info
 Connected component
 GMM
 Frame By Frame
► Car tracking
 Optical Flow
 No motion info
 GMM
► Flower garden
 Optical Flow
 No motion info
 Connected component
 GMM
Summary
► Applying a probabilistic framework to track faces in video clips
► Working in [L,a,b] color space to detect faces
► Handling over-segmentation
► Handling under-segmentation by employing optical flow information in 3 different ways:
 Connected Component Analysis
 Learning a GMM model
 Frame-by-frame tracking
Further Research
► Adaptive face color model
► Variable-length BOF (using MDL)
► Using a more complex motion model
Thank you for listening
Questions?