History of Human-Computer Interaction
Human-Computer Interaction
Tracking
Hanyang University
Jong-Il Park
Tracking in computer vision
Definition
Problem of generating an inference about the
motion of an object given a sequence of
images
Application
Motion capture
Recognition from motion
Surveillance
Who is doing what?
Eg. Security, HCI (Kinect…)
Tracking
Establish state of object using time sequence
state could be:
position; position+velocity; position+velocity+acceleration
or more complex, Eg. all joint angles for a person
Biggest problem -- Data Association
which image pixels are informative, which are not?
Key ideas
Tracking by detection
if we know what an object looks like, that selects the
pixels to use
Tracking through flow
if we know how an object moves, that selects the pixels
to use
Appearance vs. Flow
Tracking by Detection
Assume
a very reliable detector (e.g. faces; back of heads)
detections that are well spaced in images (or have
distinctive properties)
e.g. news anchors; heads in public
Link detections across time
only one - easy
multiple - weighted bipartite matching
but what if one is missing?
Better: create abstract tracks
link detections to tracks
create tracks, reap tracks as required
clean up spacetime paths
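The "multiple" case above can be sketched as a small assignment problem. This is a brute-force illustration only: the function name `link_detections`, the use of Euclidean distance between detection centres as the matching weight, and the sample coordinates are all assumptions, not something the slides specify.

```python
from itertools import permutations
from math import hypot, inf


def link_detections(prev, curr):
    """Return the minimum-total-distance one-to-one matching as (i, j) pairs.

    prev, curr: lists of (x, y) detection centres in frames n and n+1.
    Brute force over permutations, so only sensible for a handful of detections.
    Assumes len(prev) <= len(curr); extra detections in curr stay unmatched.
    """
    best_cost, best_pairs = inf, []
    for perm in permutations(range(len(curr)), len(prev)):
        cost = sum(hypot(prev[i][0] - curr[j][0], prev[i][1] - curr[j][1])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(perm))
    return best_pairs


# Two detections in frame n, three in frame n+1 (one is a new object).
prev = [(10.0, 10.0), (50.0, 40.0)]
curr = [(52.0, 41.0), (11.0, 12.0), (90.0, 90.0)]
pairs = link_detections(prev, curr)
```

The unmatched detection in `curr` is the kind of case that motivates abstract tracks: it would start a new track rather than be forced onto an existing one. For more than a few detections, a polynomial-time assignment solver would be the usual choice instead of enumeration.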
Tracking by Known Appearance
Even if we don’t have a detector
Know rectangle in image (n)
Want to find corresponding rectangle in (n+1)
Search over nearby rectangles
to find one that minimizes SSD (sum of squared differences) error
$\sum_{i,j} \left( R^{(n)}_{ij} - R^{(n+1)}_{ij} \right)^2$
where sum is over pixels in rectangle
Application
stabilize players in TV sport
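A minimal sketch of the rectangle search, assuming grayscale frames stored as 2D NumPy arrays. The function name `track_ssd`, the `radius` parameter, and the synthetic test frames are hypothetical choices made for illustration.

```python
import numpy as np


def track_ssd(prev_frame, next_frame, top, left, h, w, radius=3):
    """Find the (top, left) of the h-by-w rectangle in next_frame that
    minimises SSD against the rectangle at (top, left) in prev_frame."""
    template = prev_frame[top:top + h, left:left + w].astype(float)
    best, best_pos = np.inf, (top, left)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + h > next_frame.shape[0] or l + w > next_frame.shape[1]:
                continue  # candidate rectangle falls outside the image
            cand = next_frame[t:t + h, l:l + w].astype(float)
            ssd = np.sum((cand - template) ** 2)
            if ssd < best:
                best, best_pos = ssd, (t, l)
    return best_pos, best


# Synthetic pair of frames: the content moves 2 pixels down and 1 pixel left.
rng = np.random.default_rng(0)
frame_n = rng.random((40, 40))
frame_n1 = np.roll(frame_n, shift=(2, -1), axis=(0, 1))
pos, err = track_ssd(frame_n, frame_n1, top=10, left=10, h=8, w=8)
```

The search is exhaustive over a small window, which is only reasonable when the rectangle is known to move a little between frames.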
Building Tracks
Start at scattered points in image 1
perhaps corner detector responses
For each in image (n), compute position in (n+1)
as in previous slide
Now check tracks
patch in (n+1) should look like an affine transform
of patch in 1
Prune bad tracks
What if the patch deforms?
Eg a football player’s jersey
Colors are “similar” but SSD won’t work
Idea: patch histogram is stable
To track:
repeat
predict location of new patch
search nearby for the patch whose histogram best matches the original
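The loop above can be sketched as follows. The slides do not say how histograms are compared, so histogram intersection is used here as one common choice; the function names, the bin count, and the synthetic frames (a patch whose pixels are shuffled to mimic deformation) are assumptions.

```python
import numpy as np


def patch_hist(img, top, left, h, w, bins=8):
    """Normalised intensity histogram of a patch (values assumed in [0, 1])."""
    hist, _ = np.histogram(img[top:top + h, left:left + w], bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()


def track_hist(img, ref_hist, pred_top, pred_left, h, w, radius=4):
    """Search around the predicted location for the best-matching histogram."""
    best, best_pos = -1.0, (pred_top, pred_left)
    for t in range(max(0, pred_top - radius), min(img.shape[0] - h, pred_top + radius) + 1):
        for l in range(max(0, pred_left - radius), min(img.shape[1] - w, pred_left + radius) + 1):
            # Histogram intersection: 1.0 means identical normalised histograms.
            score = np.minimum(patch_hist(img, t, l, h, w), ref_hist).sum()
            if score > best:
                best, best_pos = score, (t, l)
    return best_pos, best


rng = np.random.default_rng(0)
obj = rng.uniform(0.5, 1.0, size=(6, 6))
frame1 = np.full((30, 30), 0.1)
frame1[10:16, 10:16] = obj
frame2 = np.full((30, 30), 0.1)
# Same pixel values in a different arrangement: SSD would fail, the histogram does not.
frame2[12:18, 13:19] = rng.permutation(obj.ravel()).reshape(6, 6)

ref = patch_hist(frame1, 10, 10, 6, 6)
pos, score = track_hist(frame2, ref, pred_top=10, pred_left=10, h=6, w=6)
```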
When are motions easy?
Current procedure
predict state
obtain measurement from prediction by search
correct state
Easy
When the object is close to where you expect it to be
Eg. object guaranteed to move only a little
Large motions can be easy
When they’re “predictable”
e.g. ballistic motion
e.g. constant velocity
Need a theory to fuse this procedure with a motion model
General Model
We assume there are moving objects, which
have an underlying state X
There are measurements Y, some of which
are functions of this state
Eg. There is a clock
at each tick, the state changes
at each tick, we get a new observation
object is ball, state is 3D position+velocity,
measurements are stereo pairs
object is person, state is body configuration,
measurements are frames, clock is in camera
(30 fps)
Tracking as an Abstract Inference Problem
Internal state: X
Measurement: Y
Major Steps:
1. Prediction
2. Data association: determining which data are informative
3. Correction
Independence Assumption
Only the immediate past matters
* X: Markov process
Measurements depend only on the current
state
Bayes Rule
Prediction
Correction
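Under the independence assumptions (X is a Markov process, and each measurement depends only on the current state), prediction and correction take the following standard form. The notation is an assumption here: $X_i$ is the state at tick $i$ and $y_0, \dots, y_i$ are the measurements observed so far.

```latex
% Prediction: propagate the previous posterior through the dynamics
P(X_i \mid y_0, \dots, y_{i-1})
  = \int P(X_i \mid X_{i-1}) \, P(X_{i-1} \mid y_0, \dots, y_{i-1}) \, dX_{i-1}

% Correction: apply Bayes rule with the new measurement y_i
P(X_i \mid y_0, \dots, y_i)
  = \frac{P(y_i \mid X_i) \, P(X_i \mid y_0, \dots, y_{i-1})}
         {\int P(y_i \mid X_i) \, P(X_i \mid y_0, \dots, y_{i-1}) \, dX_i}
```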
Linear Dynamic Model
Model
$x_i \sim N(D x_{i-1}, \Sigma_d)$
$y_i \sim N(M x_i, \Sigma_m)$
D: Transition matrix
M: Measurement matrix
Kalman filter is well-suited for this type of
tracking
Read Ch.11.3, Forsyth & Ponce, Computer
Vision, 2012.
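A minimal sketch of one predict/correct cycle for the linear dynamic model, using the slide's D (transition) and M (measurement) matrices. The noise covariances `Sigma_d` and `Sigma_m`, the constant-velocity example, and every numeric value are assumptions made for illustration.

```python
import numpy as np


def kalman_step(x, P, y, D, M, Sigma_d, Sigma_m):
    """One predict/correct cycle for x_i = D x_{i-1} + noise, y_i = M x_i + noise."""
    # Prediction: push the previous estimate and its covariance through the dynamics.
    x_pred = D @ x
    P_pred = D @ P @ D.T + Sigma_d
    # Correction: blend the prediction with the new measurement.
    S = M @ P_pred @ M.T + Sigma_m           # innovation covariance
    K = P_pred @ M.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (y - M @ x_pred)
    P_new = (np.eye(len(x)) - K @ M) @ P_pred
    return x_new, P_new


# Constant-velocity point on a line: state = (position, velocity), position is measured.
D = np.array([[1.0, 1.0], [0.0, 1.0]])
M = np.array([[1.0, 0.0]])
Sigma_d = 1e-4 * np.eye(2)
Sigma_m = np.array([[0.25]])

rng = np.random.default_rng(1)
x_est, P = np.zeros(2), 10.0 * np.eye(2)     # vague initial belief
true_x = np.array([0.0, 0.5])
for _ in range(50):
    true_x = D @ true_x
    y = M @ true_x + rng.normal(0.0, 0.5, size=1)
    x_est, P = kalman_step(x_est, P, y, D, M, Sigma_d, Sigma_m)
```

Although only position is measured, the filter recovers the velocity as well, because the transition matrix couples the two state components.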
Kalman filter
Eg. Kalman filter
[Figure legend: * predicted, x measured, + estimated, o true; bars: 3σ]
Nonlinear Dynamics
Problems
tend not to be normal
may not be Gaussian (quite common in vision problems)
multiple, well-separated modes
Eg. Nonlinear model
Time Evolution & pdf
Difficulties in Complicated Models
In order to maintain the pdf:
handling multiple peaks
handling multi-dimensional state vector
How?
Particle filtering is a useful approach.
Sampled Representation
Representation of pdf is NOT for representing a pdf itself BUT for computing some expectation
Computing expectation using sampled representation
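Monte Carlo Integration
$\int g(u)\, p_f(u)\, du \approx \frac{\sum_{i=1}^{N} w^i g(u^i)}{\sum_{i=1}^{N} w^i}$
where
$u^i \sim s(u)$: sampling distribution
$w^i = f(u^i) / s(u^i)$: weight, with $p_f(u) = f(u) / \int f(u)\, du$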
Obtaining a sampled representation of probability distribution
Computing an expectation using a set of samples
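A small sketch of computing an expectation from weighted samples. The target `f` (an unnormalised Gaussian bump), the sampling distribution `s_pdf`, and the sample count are hypothetical choices; the point is only that samples drawn from one distribution, once weighted, yield expectations under another.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000


def f(u):
    """Unnormalised target: a Gaussian bump centred at 2 with unit variance."""
    return np.exp(-0.5 * (u - 2.0) ** 2)


def s_pdf(u):
    """Density of the sampling distribution: normal with mean 0 and std 3."""
    return np.exp(-0.5 * (u / 3.0) ** 2) / (3.0 * np.sqrt(2.0 * np.pi))


u = rng.normal(0.0, 3.0, size=N)     # samples drawn from the sampling distribution
w = f(u) / s_pdf(u)                  # weight of each sample


def expectation(g):
    """Weighted-sample estimate of E[g(U)] under the normalised target."""
    return np.sum(w * g(u)) / np.sum(w)


mean_est = expectation(lambda x: x)            # true value is 2
second_moment = expectation(lambda x: x ** 2)  # true value is 2^2 + 1 = 5
```

Dividing by the sum of the weights means the target never has to be normalised, which is why the sampled representation is convenient for posteriors known only up to a constant.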
Transformation
Transforming a sampled representation of
a prior into that of a posterior
Naive particle filter
Eg. Naive particle filter
Poor results due to sample impoverishment!
Most of the weights get small very fast
The way to get accurate estimates is to have samples lie where p is likely to be large
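The collapse of the weights can be seen in a toy run. This sketch assumes a 1D state, a random-walk prediction step, and a Gaussian likelihood, none of which come from the slides. It also uses the effective sample size $1 / \sum_i (w^i)^2$, a standard diagnostic introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
particles = rng.normal(0.0, 1.0, size=N)     # samples from an assumed prior
weights = np.full(N, 1.0 / N)

ess_history = []
for _ in range(20):
    # Prediction: diffuse every particle; weights are carried along unchanged.
    particles = particles + rng.normal(0.0, 0.3, size=N)
    # Correction: multiply each weight by the likelihood of the new measurement.
    y = rng.normal(0.0, 0.2)                 # noisy measurement of an object at 0
    weights = weights * np.exp(-0.5 * ((y - particles) / 0.2) ** 2)
    weights = weights / weights.sum()
    # Effective sample size: N for uniform weights, 1 if one weight dominates.
    ess_history.append(1.0 / np.sum(weights ** 2))
```

Because the particles are never moved toward the regions where the weights are large, the effective sample size drops sharply after a few ticks and most particles stop contributing to any estimate.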
Overcoming Sample Impoverishment
Equivalent to maintaining a set of good
particles
Then, how?
Resampling the prior
1. Expand the sample set non-uniformly using the weights: form a new set of samples consisting of a union of $N_k$ copies of $(s^k, 1)$ for each $k$.
2. Subsample the sample set uniformly
Practical particle filter
Initialization
Prediction
Correction
(prediction and correction as in the naive particle filter)
Resampling:
1. Normalise the weights so that $\sum_i w^i = 1$
2. Compute the variance of the normalised weights.
If (var > Th), construct a new set of samples by drawing, with replacement, N samples from the old set, using the weights as the probability that a sample will be drawn.
The weight of each sample is now 1/N.
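The four stages can be sketched for a 1D state as follows. The random-walk prediction, the Gaussian likelihood, the function names, and all numeric values (including the variance threshold) are assumptions for illustration, not the slides' own settings.

```python
import numpy as np

rng = np.random.default_rng(0)


def resample_if_needed(particles, weights, var_threshold):
    """Normalise the weights; if their variance exceeds the threshold, draw N
    samples with replacement using the weights as probabilities and reset
    every weight to 1/N."""
    n = len(particles)
    weights = weights / weights.sum()
    if weights.var() > var_threshold:
        idx = rng.choice(n, size=n, replace=True, p=weights)
        particles, weights = particles[idx], np.full(n, 1.0 / n)
    return particles, weights


def particle_filter_step(particles, weights, y, drift_std, meas_std, var_threshold):
    # Prediction: propagate each particle through a random-walk model.
    particles = particles + rng.normal(0.0, drift_std, size=particles.shape)
    # Correction: reweight by the Gaussian likelihood of the measurement.
    weights = weights * np.exp(-0.5 * ((y - particles) / meas_std) ** 2)
    return resample_if_needed(particles, weights, var_threshold)


N = 500
particles = rng.uniform(-10.0, 10.0, size=N)   # initialization: broad prior
weights = np.full(N, 1.0 / N)
true_x = 3.0
for _ in range(30):
    true_x += 0.1                              # the object drifts slowly
    y = true_x + rng.normal(0.0, 0.5)
    particles, weights = particle_filter_step(particles, weights, y, 0.2, 0.5, 1e-7)
estimate = np.sum(weights * particles)         # posterior mean of the state
```

With this low threshold the filter resamples on almost every tick; a higher threshold resamples less often and keeps more diversity among the particles.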
Consequence of resampling
Particles that tend to reflect the state rather
well usually reappear in the resampled set
Many particles lie within one standard
deviation of the mean of the posterior
Particle filters
Different community -> different name
statistics: particle filter
AI: survival of the fittest
computer vision: condensation
Tracking people
Essential components
Motion model
Likelihood model
P(image features|person present at given
configuration)
Motion model
Strong motion model: markers, angles,…
Weak motion model: drift model
Likelihood model
SSD, edges, other features,…
Likelihood computation
Boundary information
Non-background information
Sample points
Annealed particle filter
To overcome the problems of
High dimensionality of the state
Many local peaks in the likelihood
Annealing
Starts from smooth approximations to the likelihood and moves to less smooth approximations
Repeats weighting and resampling
[Deutscher, Blake, Reid, CVPR 2000]
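A simplified sketch of the annealing idea on a 1D toy problem. Smoothing is done here by raising the likelihood to a power `beta` that grows from layer to layer; the bimodal likelihood, the fixed `betas` and `noise_stds` schedules, and the function names are assumptions for illustration and are not the schedule used in the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def likelihood(x):
    """Toy bimodal likelihood: global peak at 4, lower local peak at -3."""
    return (np.exp(-0.5 * ((x - 4.0) / 0.3) ** 2)
            + 0.4 * np.exp(-0.5 * ((x + 3.0) / 0.3) ** 2))


def annealed_layers(particles, betas, noise_stds):
    """Run one annealing pass: weight by likelihood**beta, resample, diffuse."""
    n = len(particles)
    for beta, noise in zip(betas, noise_stds):
        w = likelihood(particles) ** beta        # small beta = smooth approximation
        w = w / w.sum()
        particles = particles[rng.choice(n, size=n, p=w)]
        particles = particles + rng.normal(0.0, noise, size=n)
    return particles


start = rng.uniform(-8.0, 8.0, size=400)
final = annealed_layers(start, betas=[0.1, 0.3, 0.6, 1.0],
                        noise_stds=[1.0, 0.5, 0.2, 0.05])
```

Early layers let particles survive across a wide region around both peaks; later layers sharpen the weighting so that most particles end up near the global peak rather than being trapped at the local one.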
Finding people
As an initialization of people tracker
There is no person tracker that represents
the configuration of the body and can start
automatically
Known approaches
1. Template matching
2. Finding faces
3. Search over correspondence
* Challenging topic: learning models from data