History of Human-Computer Interaction

Human-Computer Interaction
Tracking
Hanyang University
Jong-Il Park
Tracking in computer vision
• Definition
  - The problem of generating an inference about the motion of an object given a sequence of images
• Applications
  - Motion capture
  - Recognition from motion
  - Surveillance
    - Who is doing what?
    - E.g. security, HCI (Kinect…)
Tracking
• Establish the state of an object using a time sequence
  - The state could be: position; position + velocity; position + velocity + acceleration; or something more complex, e.g. all joint angles for a person
• Biggest problem: data association
  - Which image pixels are informative, and which are not?
• Key ideas
  - Tracking by detection: if we know what an object looks like, that selects the pixels to use
  - Tracking through flow: if we know how an object moves, that selects the pixels to use
Appearance vs. Flow
Tracking by Detection
• Assume
  - a very reliable detector (e.g. faces; backs of heads)
  - detections that are well spaced in images (or have distinctive properties), e.g. news anchors; heads in public
• Link detections across time
  - only one: easy
  - multiple: weighted bipartite matching
  - but what if one is missing?
• Better: create abstract tracks
  - link detections to tracks
  - create tracks, reap tracks as required
  - clean up spacetime paths
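The linking step can be sketched as a small minimum-cost matching between the detections of two consecutive frames. This is only an illustration under stated assumptions: the function name `link_detections`, the Euclidean distance cost, and the fixed `miss_cost` charged when a track finds no detection are all hypothetical choices, and the brute-force search is only practical for a handful of objects.

```python
from itertools import permutations
from math import hypot


def link_detections(prev, curr, miss_cost=50.0):
    """Match each previous detection to a current one, or to None if missing.

    prev, curr: lists of (x, y) detection centres.
    Returns a list `match` where match[i] is an index into `curr` or None.
    Brute-force minimum-cost bipartite matching (assumed small inputs).
    """
    # One "missing" slot per previous detection lets any track go unmatched.
    slots = list(range(len(curr))) + [None] * len(prev)
    best_cost, best = float("inf"), []
    for assign in permutations(slots, len(prev)):
        cost = 0.0
        for i, j in enumerate(assign):
            if j is None:
                cost += miss_cost  # penalty for a missed detection
            else:
                cost += hypot(prev[i][0] - curr[j][0], prev[i][1] - curr[j][1])
        if cost < best_cost:
            best_cost, best = cost, list(assign)
    return best
```

A track whose entry comes back as `None` is a candidate for being kept alive for a few frames or reaped; that policy is left out of the sketch.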

Tracking by Known Appearance
• Even if we don't have a detector
  - Know the rectangle in image (n)
  - Want to find the corresponding rectangle in image (n+1)
  - Search over nearby rectangles to find the one that minimizes the SSD (sum of squared differences) error, where the sum is over pixels in the rectangle
• Application
  - stabilize players in TV sport
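The SSD search above can be sketched in a few lines. This assumes grayscale images stored as lists of rows and a square search window of hypothetical half-width `radius`; the names `ssd` and `track_patch` are illustrative, not from the slides.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between template and the image patch at (top, left)."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            diff = image[top + r][left + c] - value
            total += diff * diff
    return total


def track_patch(image, template, top, left, radius=2):
    """Search rectangles near (top, left) in the next image for the lowest SSD.

    Returns the (top, left) corner of the best-matching rectangle.
    """
    h, w = len(template), len(template[0])
    best = None
    # Clamp the search window so every candidate rectangle stays inside the image.
    for t in range(max(0, top - radius), min(len(image) - h, top + radius) + 1):
        for l in range(max(0, left - radius), min(len(image[0]) - w, left + radius) + 1):
            err = ssd(image, template, t, l)
            if best is None or err < best[0]:
                best = (err, t, l)
    return best[1], best[2]
```

The template is the rectangle cut from image (n); calling `track_patch` on image (n+1) with the old corner gives the new corner.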
Building Tracks
• Start at scattered points in image 1
  - perhaps corner detector responses
• For each point in image (n), compute its position in image (n+1)
  - as in the previous slide
• Now check tracks
  - the patch in image (n+1) should look like an affine transform of the patch in image 1
• Prune bad tracks

What if the patch deforms?
• E.g. a football player's jersey
  - Colors are "similar", but SSD won't work
• Idea: the patch histogram is stable
• To track, repeat:
  - predict the location of the new patch
  - search nearby for the patch whose histogram best matches the original
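A minimal sketch of the histogram search follows. The slides do not say how histograms are compared, so histogram intersection is an assumption here, as are the bin count, the grayscale list-of-rows image format, and the names `histogram`, `intersection` and `track_by_histogram`.

```python
def histogram(image, top, left, h, w, bins=4, max_value=256):
    """Normalised intensity histogram of the h x w patch at (top, left)."""
    counts = [0] * bins
    for r in range(top, top + h):
        for c in range(left, left + w):
            counts[image[r][c] * bins // max_value] += 1
    total = float(h * w)
    return [n / total for n in counts]


def intersection(p, q):
    """Histogram intersection: 1.0 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(p, q))


def track_by_histogram(image, target_hist, top, left, h, w, radius=2):
    """Search near the predicted corner for the patch whose histogram matches best."""
    bins = len(target_hist)
    best = None
    for t in range(max(0, top - radius), min(len(image) - h, top + radius) + 1):
        for l in range(max(0, left - radius), min(len(image[0]) - w, left + radius) + 1):
            score = intersection(histogram(image, t, l, h, w, bins), target_hist)
            if best is None or score > best[0]:
                best = (score, t, l)
    return best[1], best[2]
```

Because the histogram ignores where pixels sit inside the patch, a patch whose pixels are rearranged still scores 1.0, which is the property the slide relies on.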
When are motions easy?
• Current procedure
  - predict the state
  - obtain a measurement by searching near the prediction
  - correct the state
• Easy
  - When the object is close to where you expect it to be, e.g. the object is guaranteed to move only a little
• Large motions can be easy
  - When they're "predictable", e.g. ballistic motion or constant velocity
• Need a theory to fuse this procedure with a motion model
General Model
• We assume there are moving objects, which have an underlying state X
• There are measurements Y, some of which are functions of this state
• There is a clock
  - at each tick, the state changes
  - at each tick, we get a new observation
• E.g.
  - object is a ball, state is 3D position + velocity, measurements are stereo pairs
  - object is a person, state is body configuration, measurements are frames, clock is in the camera (30 fps)
Tracking as an Abstract Inference Problem
• Internal state: X
• Measurement: Y
• Major steps:
  1. Prediction
  2. Data association: determining which data are informative
  3. Correction
Independence Assumption
• Only the immediate past matters
  - X is a Markov process
• Measurements depend only on the current state
Bayes Rule
Prediction
Correction
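The equations for these slides did not survive extraction. For reference, the standard forms that follow from the two independence assumptions are given here; the indexing (state $X_i$, measurements $y_0, \dots, y_i$) is an assumed notation and may differ from the original slides.

```latex
\begin{align*}
% Independence assumptions
P(X_i \mid X_1, \dots, X_{i-1}) &= P(X_i \mid X_{i-1}) \\
P(Y_i, Y_j, \dots, Y_k \mid X_i) &= P(Y_i \mid X_i)\, P(Y_j, \dots, Y_k \mid X_i) \\[6pt]
% Prediction: push the previous posterior through the dynamics
P(X_i \mid y_0, \dots, y_{i-1}) &= \int P(X_i \mid X_{i-1})\,
    P(X_{i-1} \mid y_0, \dots, y_{i-1})\, dX_{i-1} \\[6pt]
% Correction: Bayes rule with the new measurement y_i
P(X_i \mid y_0, \dots, y_i) &=
    \frac{P(y_i \mid X_i)\, P(X_i \mid y_0, \dots, y_{i-1})}
         {\int P(y_i \mid X_i)\, P(X_i \mid y_0, \dots, y_{i-1})\, dX_i}
\end{align*}
```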
Linear Dynamic Model
• Model
  - D: transition matrix
  - M: measurement matrix
• The Kalman filter is well suited to this type of tracking
• Read Ch. 11.3, Forsyth & Ponce, Computer Vision, 2012.
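The model equations are missing from the transcript. A linear dynamic model with transition matrix $D$ and measurement matrix $M$ is usually written as below; the covariance symbols $\Sigma_{d_i}$ and $\Sigma_{m_i}$ and the time index on the matrices are assumed notation.

```latex
\begin{align*}
x_i &\sim N\!\left(D_i\, x_{i-1},\ \Sigma_{d_i}\right) && \text{(state evolves linearly, with Gaussian noise)} \\
y_i &\sim N\!\left(M_i\, x_i,\ \Sigma_{m_i}\right)     && \text{(measurement is linear in the state, with Gaussian noise)}
\end{align*}
```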
Kalman filter
E.g. Kalman filter
[Figure legend: * predicted; x measured; + estimated; o true; bars: 3σ]
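A minimal sketch of one predict/correct cycle, assuming a constant-velocity model with state (position, velocity), D = [[1, dt], [0, 1]], M = [1, 0], and scalar noise levels `q` (process) and `r` (measurement). These values and the name `kalman_step` are illustrative assumptions; the 2x2 algebra is written out by hand so the block needs no libraries.

```python
def kalman_step(state, cov, y, dt=1.0, q=1e-3, r=1.0):
    """One Kalman predict + correct step for a constant-velocity model.

    state: (position, velocity); cov: symmetric 2x2 covariance as nested tuples;
    y: measured position. Returns the corrected (state, cov).
    """
    (p, v), ((a, b), (_, c)) = state, cov

    # Prediction: x_pred = D x, P_pred = D P D^T + q I
    p_pred, v_pred = p + dt * v, v
    a_pred = a + 2 * dt * b + dt * dt * c + q
    b_pred = b + dt * c
    c_pred = c + q

    # Correction: gain K = P_pred M^T / (M P_pred M^T + r)
    s = a_pred + r
    k0, k1 = a_pred / s, b_pred / s
    residual = y - p_pred
    new_state = (p_pred + k0 * residual, v_pred + k1 * residual)

    # P = (I - K M) P_pred
    new_cov = (((1 - k0) * a_pred, (1 - k0) * b_pred),
               ((1 - k0) * b_pred, c_pred - k1 * b_pred))
    return new_state, new_cov
```

Feeding positions that grow by a fixed amount per tick drives the velocity estimate toward that amount while the position variance shrinks, which mirrors the narrowing error bars in the example figure.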
Nonlinear Dynamics
• Problems
  - the distributions tend not to be normal
  - they may not be Gaussian (quite common in vision problems)
  - there may be multiple, well-separated modes
E.g. Nonlinear model
Time Evolution & pdf
Difficulties in Complicated Models
• In order to maintain
  - handling multiple peaks
  - handling a multi-dimensional state vector
• How? Particle filtering is a useful approach.
Sampled Representation
• A sampled representation of a pdf is
  - NOT for representing the pdf itself
  - BUT for computing some expectation
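Computing expectation using sampled representation
Monte Carlo Integration
• An expectation is approximated by a weighted average over samples, where
  - the samples are drawn from a sampling distribution
  - each sample carries a weight
Obtaining a sampled representation of a probability distribution
Computing an expectation using a set of samples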
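The two procedures named above can be sketched as follows. The notation is assumed, since the slide's symbols were lost: samples u_i are drawn from a sampling distribution s, each gets the weight w_i = p(u_i) / s(u_i), and the expectation of f under p is approximated by sum(w_i * f(u_i)) / sum(w_i). The function names are hypothetical.

```python
import math
import random


def sampled_representation(p, s_draw, s_pdf, n, rng):
    """Draw n samples from the sampling distribution and weight them by p / s.

    p: target density (may be unnormalised); s_draw(rng): draws one sample;
    s_pdf(u): density of the sampling distribution at u.
    """
    samples = [s_draw(rng) for _ in range(n)]
    weights = [p(u) / s_pdf(u) for u in samples]
    return samples, weights


def expectation(f, samples, weights):
    """Weighted-average estimate of E_p[f(u)] from a sampled representation."""
    total = sum(weights)
    return sum(w * f(u) for u, w in zip(samples, weights)) / total
```

For example, with p a Gaussian bump centred at 1 and s uniform on [-4, 6], `expectation(lambda u: u, samples, weights)` comes out close to 1 for a few thousand samples. Dividing by the weight sum means p does not have to be normalised.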
Transformation
• Transforming a sampled representation of a prior into that of a posterior
Naive particle filter
E.g. Naive particle filter
• Poor results due to sample impoverishment!
  - Most of the weights get small very fast
• The way to get accurate estimates is to have samples lie where p is likely to be large
Overcoming Sample Impoverishment
• Equivalent to maintaining a set of good particles
• Then, how?
Resampling the prior
1. Expand the sample set non-uniformly using the weights
   - Form a new set of samples consisting of a union of N_k copies of (s_k, 1) for each k.
2. Subsample the sample set uniformly
Practical particle filter
• Initialization
• Prediction (as in the naive particle filter)
• Correction (as in the naive particle filter)
• Resampling:
  1. Normalise the weights so that they sum to 1.
  2. Compute the variance of the normalised weights.
     If var > Th, construct a new set of samples by drawing, with replacement, N samples from the old set, using the weights as the probability that a sample will be drawn. The weight of each sample is now 1/N.
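The steps above can be sketched for a one-dimensional state. Everything specific is an assumption made for illustration: a random-walk motion model with standard deviation `motion_std`, a Gaussian measurement likelihood with standard deviation `meas_std`, and a default threshold Th of 1/N^2 (which corresponds to resampling when the effective sample size falls below N/2).

```python
import math
import random


def particle_filter_step(particles, weights, y, rng,
                         motion_std=0.5, meas_std=1.0, var_threshold=None):
    """One prediction / correction / resampling cycle for a scalar state."""
    n = len(particles)
    if var_threshold is None:
        var_threshold = 1.0 / (n * n)  # assumed default for Th

    # Prediction: push every particle through the (assumed) random-walk dynamics.
    particles = [x + rng.gauss(0.0, motion_std) for x in particles]

    # Correction: multiply each weight by the likelihood of the measurement y.
    weights = [w * math.exp(-0.5 * ((y - x) / meas_std) ** 2)
               for x, w in zip(particles, weights)]

    # Resampling step 1: normalise the weights so that they sum to 1.
    total = sum(weights)
    if total == 0.0:  # all weights underflowed; fall back to uniform weights
        weights = [1.0 / n] * n
    else:
        weights = [w / total for w in weights]

    # Resampling step 2: variance test, then draw N samples with replacement.
    mean_w = 1.0 / n
    var = sum((w - mean_w) ** 2 for w in weights) / n
    if var > var_threshold:
        particles = rng.choices(particles, weights=weights, k=n)
        weights = [1.0 / n] * n
    return particles, weights
```

The state estimate after each step is the weighted mean `sum(x * w for x, w in zip(particles, weights))`.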
Consequence of resampling
• Particles that tend to reflect the state rather well usually reappear in the resampled set
• Many particles lie within one standard deviation of the mean of the posterior
Particle filters
• Different community -> different name
  - statistics: particle filter
  - AI: survival of the fittest
  - computer vision: condensation
Tracking people
• Essential components
  - Motion model
  - Likelihood model: P(image features | person present at given configuration)
• Motion model
  - Strong motion model: markers, angles,…
  - Weak motion model: drift model
• Likelihood model
  - SSD, edges, other features,…
Likelihood computation
[Figure labels: boundary information; non-background information; sample points]
Annealed particle filter
• To overcome the problems of
  - High dimensionality of the state
  - Many local peaks in the likelihood
• Annealing
  - Starts from smooth approximations to the likelihood and moves to less smooth approximations
  - Repeats weighting and resampling
[Deutscher, Blake, Reid, CVPR 2000]
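A rough sketch of the annealing idea for a scalar state, not the published algorithm. It assumes the likelihood is smoothed by raising it to a power beta < 1, that beta increases from layer to layer, and that the diffusion noise shrinks with each layer; the schedule, the parameter names and `annealed_update` itself are all illustrative assumptions.

```python
import math
import random


def annealed_update(particles, log_likelihood, betas, rng, diffusion_std=1.0):
    """Repeat weighting and resampling over increasingly sharp likelihoods.

    betas: increasing exponents, e.g. [0.01, 0.05, 0.2, 1.0]; a small beta
    gives a smooth approximation to the likelihood, beta = 1 the likelihood itself.
    """
    n = len(particles)
    for layer, beta in enumerate(betas):
        # Weighting with the smoothed likelihood p(y | x) ** beta.
        logs = [beta * log_likelihood(x) for x in particles]
        top = max(logs)  # subtract the maximum to avoid underflow
        weights = [math.exp(v - top) for v in logs]

        # Resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n)

        # Diffusion, with less noise in later (sharper) layers.
        std = diffusion_std / (layer + 1)
        particles = [x + rng.gauss(0.0, std) for x in particles]
    return particles
```

With a likelihood that has one tall peak and one much lower peak, the early smooth layers keep particles near both, and the later layers concentrate them on the tall one.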
Finding people
• As an initialization of a people tracker
  - There is no person tracker that represents the configuration of the body and can start automatically
• Known approaches
  1. Template matching
  2. Finding faces
  3. Search over correspondence
* Challenging topic: learning models from data