
EE4H, M.Sc 0407191
Computer Vision
Dr. Mike Spann
[email protected]
http://www.eee.bham.ac.uk/spannm
Contents
 Why evaluate?
 Images – synthetic/natural?
 Noise
 Example 1. Evaluation of
thresholding/segmentation methods
 Example 2. Evaluation of optical flow methods
Why evaluate?
 Computer vision algorithms are complex and
difficult to analyse mathematically
 Evaluation is usually through measurement of the
algorithm’s performance on test images
 Use of a range of images to establish performance
envelope
 Comparison with existing algorithms
 Performance on degraded (noise-added) images
(robustness)
 Sensitivity to algorithm parameter settings
Test images
 Real images
 ‘Ground truth’ difficult to establish
 Pseudo-real images
 Could be synthetic objects moving against real
background
 Often a good compromise
 Synthetic images
 Noise and illumination variation over object surfaces
hard to model realistically
Simple synthetic images
 Simple ‘object-background’ synthetic images used to
evaluate thresholding and segmentation algorithms
 They obey a very simple image model (piecewise
constant + Gaussian noise)
 Unrealistic in practice – images are not like this!
Simple synthetic images

[Figure: simple 'object-background' synthetic images at zero, low and medium noise levels]
Pseudo-real images
 More realistic object/background images are better for evaluating segmentation algorithms
 Images of natural objects under natural illumination
 Ground truth can be established using hand segmentation tools (such as those built into many image processing packages)
Pseudo-real images

[Figure: example pseudo-real images of screws, washers, keys and cars]
Simple synthetic edges
 Again, piecewise constant + Gaussian noise image
model
 ‘Ideal’ step edge
 Precise edge location but not achievable by finite
aperture imaging systems
Simple synthetic edges

[Figure: synthetic step edges at low, medium and high noise levels]
Pseudo-real edges
 More realistic edge profiles can be created by
smoothing an ideal step edge
[Figure: step edge * Gaussian filter = smoothed (pseudo-real) edge profile]
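A minimal sketch of this construction in Java (an illustration, not from the slides; the profile length, sigma and border handling are arbitrary choices):

    // Generates a pseudo-real edge profile: a 1D ideal step edge convolved
    // with a sampled, normalised Gaussian kernel (illustrative parameters).
    public class SmoothedEdge {
        public static void main(String[] args) {
            int n = 64;
            double sigma = 2.0;
            // Ideal step edge: 0 on the left half, 255 on the right half
            double[] step = new double[n];
            for (int i = n / 2; i < n; i++) step[i] = 255.0;

            // Sampled Gaussian kernel of half-width 3*sigma, normalised to sum 1
            int r = (int) Math.ceil(3 * sigma);
            double[] kernel = new double[2 * r + 1];
            double sum = 0;
            for (int i = -r; i <= r; i++) {
                kernel[i + r] = Math.exp(-i * i / (2 * sigma * sigma));
                sum += kernel[i + r];
            }
            for (int i = 0; i < kernel.length; i++) kernel[i] /= sum;

            // Convolve (replicating the border samples) to get the smoothed profile
            double[] edge = new double[n];
            for (int x = 0; x < n; x++) {
                double v = 0;
                for (int i = -r; i <= r; i++) {
                    int xi = Math.min(n - 1, Math.max(0, x + i));
                    v += kernel[i + r] * step[xi];
                }
                edge[x] = v;
            }
            System.out.println(java.util.Arrays.toString(edge));
        }
    }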
Pseudo-real movies
 The 'yosemite' sequence is a computer-generated movie of a fly-through of the Yosemite valley
 Background clouds are real
 Enables true flow (ground truth) to be
determined
 Used extensively in the evaluation of optical
flow algorithms
 yosemite.avi
 yosemite_flow.avi
Noise
 Often used to evaluate the ‘robustness’ of
algorithms
 Additive noise usual in optical images but
multiplicative is more realistic in sonar/radar
images
 Noise level proportional to signal level
 Usual noise model is independent random
variables (usually Gaussian)
 Correlated noise often more realistic
Noise
 Standard noise model is zero-mean independent and identically distributed (iid) Gaussian (normal) random variables
 Characterised by the variance $\sigma^2$
 Probability distribution of the rv's:

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{x^2}{2\sigma^2} \right)$$
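A minimal sketch of adding such noise in Java (assuming the image is held as a 2D double array of grey levels, clipped back to [0, 255]):

    import java.util.Random;

    public class GaussianNoise {
        // Adds zero-mean iid Gaussian noise of standard deviation sigma in place
        public static void addNoise(double[][] img, double sigma, long seed) {
            Random rng = new Random(seed);
            for (int y = 0; y < img.length; y++) {
                for (int x = 0; x < img[y].length; x++) {
                    double v = img[y][x] + sigma * rng.nextGaussian();
                    img[y][x] = Math.max(0.0, Math.min(255.0, v)); // clip to valid range
                }
            }
        }
    }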
Noise
 Noise level characterised by the signal-to-noise ratio
 Usually expressed in dB
 Defined as:

$$S/N = 10 \log_{10} \left( \frac{S^2}{\sigma^2} \right)$$

 $S^2$ is the mean-square grey level defined (for a $W \times H$ pixel image) as

$$S^2 = \frac{1}{WH} \sum_{x,y} g(x,y)^2$$
Noise

[Figure: the same image at ∞ dB (noise-free), 30 dB and 0 dB S/N]
Noise (mean-square error)
 We can regard the mean-square error (difference) between 2 images as noise
 Often used to evaluate image compression algorithms in comparing the original and decompressed images
 Image differences can also be expressed as the peak-signal-to-noise ratio (PSNR) in dB by taking the signal level as 255
Noise (mean-square error)
$$\mathrm{mse} = \frac{1}{WH} \sum_{x,y} \left( g(x,y) - \hat{g}(x,y) \right)^2$$

$$PSNR = 10 \log_{10} \left( \frac{255^2}{\mathrm{mse}} \right) \ \mathrm{dB}$$
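A minimal sketch of these two measures in Java (assuming the original image g and the decompressed image gHat are 2D double arrays of the same size):

    public class PSNR {
        // Mean-square error between the original and the decompressed image
        public static double mse(double[][] g, double[][] gHat) {
            double sum = 0;
            int count = 0;
            for (int y = 0; y < g.length; y++) {
                for (int x = 0; x < g[y].length; x++) {
                    double d = g[y][x] - gHat[y][x];
                    sum += d * d;
                    count++;
                }
            }
            return sum / count;
        }

        // PSNR in dB, taking the peak signal level as 255
        public static double psnrDb(double[][] g, double[][] gHat) {
            return 10.0 * Math.log10(255.0 * 255.0 / mse(g, gHat));
        }
    }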
Other types of noise
 The other main category of (additive) noise is impulse (sometimes called 'salt and pepper') noise
 Characterised by the impulse rate (spatial density of noise impulses) and the mean-square amplitude of the impulses
 Can normally be easily filtered out using median filters (see the sketch below)
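A minimal sketch of both steps in Java (the noise rate and the 3x3 window are illustrative choices; border pixels are left unfiltered for brevity):

    import java.util.Arrays;
    import java.util.Random;

    public class ImpulseNoise {
        // Sets a fraction 'rate' of pixels to 0 (pepper) or 255 (salt)
        public static void saltAndPepper(double[][] img, double rate, long seed) {
            Random rng = new Random(seed);
            for (int y = 0; y < img.length; y++)
                for (int x = 0; x < img[y].length; x++)
                    if (rng.nextDouble() < rate)
                        img[y][x] = rng.nextBoolean() ? 255.0 : 0.0;
        }

        // 3x3 median filter: replaces each interior pixel by the median of its neighbourhood
        public static double[][] median3x3(double[][] img) {
            int h = img.length, w = img[0].length;
            double[][] out = new double[h][w];
            for (int y = 0; y < h; y++) out[y] = img[y].clone();
            double[] window = new double[9];
            for (int y = 1; y < h - 1; y++) {
                for (int x = 1; x < w - 1; x++) {
                    int k = 0;
                    for (int dy = -1; dy <= 1; dy++)
                        for (int dx = -1; dx <= 1; dx++)
                            window[k++] = img[y + dy][x + dx];
                    Arrays.sort(window);
                    out[y][x] = window[4]; // the median of the 9 samples
                }
            }
            return out;
        }
    }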
Other types of noise
Original
Salt and pepper noise De-speckled
Other types of noise
 There are many other types of noise which can be considered in algorithm evaluation
 Essentially more sophisticated and realistic probability distributions for the noise rv's
  For example, a 'generalised' Gaussian model is often used to model 'heavy-tailed' distributions
 However, in my humble opinion, a more realistic source of noise is the deviation of the illumination variation across object surfaces away from the 'ideal'
Evaluation of thresholding & segmentation methods
 Segmentation and thresholding algorithms essentially group pixels into regions (or classes)
 Simplest case is object/background
 Simple evaluation metrics just quantify the number of misclassified pixels
 For basic image models such as constant grey level in object/background regions plus iid Gaussian noise, the probability of error can be computed analytically
Evaluation of thresholding & segmentation methods
 For a simple object/background image:
$$\begin{aligned}
\text{Prob(pixel is an object pixel)} &= P_o \\
\text{Prob(pixel is a background pixel)} &= P_b \\
\text{Prob(misclassifying an object pixel)} &= p(b \mid o) \\
\text{Prob(misclassifying a background pixel)} &= p(o \mid b) \\
\text{Prob(misclassifying a pixel)} &= P_o \, p(b \mid o) + P_b \, p(o \mid b) = P_{miss}(T)
\end{aligned}$$
Evaluation of thresholding & segmentation methods
 Misclassification probability is a function of a threshold T
 For a simple constant region grey level model plus additive iid Gaussian noise we can easily derive an analytical expression for $P_{miss}(T)$ (see the sketch below)
 Not very useful in practice as the image model is limited and we also require the ground truth
 More useful to simply measure the misclassification error as a function of threshold
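A minimal sketch of the analytical expression in Java (assumed model: object mean muO > background mean muB, common standard deviation sigma, priors pO and pB, and pixels above T classified as object; the normal CDF uses an Abramowitz-Stegun style approximation since java.lang.Math has no erf):

    public class MissProbability {
        // Approximate standard normal CDF Phi(z) (Abramowitz & Stegun 26.2.17)
        static double phi(double z) {
            double t = 1.0 / (1.0 + 0.2316419 * Math.abs(z));
            double d = Math.exp(-z * z / 2) / Math.sqrt(2 * Math.PI);
            double p = d * t * (0.319381530 + t * (-0.356563782
                    + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
            return z >= 0 ? 1.0 - p : p;
        }

        // P_miss(T) = P_o * p(b|o) + P_b * p(o|b)
        static double pMiss(double T, double muO, double muB, double sigma,
                            double pO, double pB) {
            double pbGivenO = phi((T - muO) / sigma);       // object pixel falls below T
            double poGivenB = 1.0 - phi((T - muB) / sigma); // background pixel rises above T
            return pO * pbGivenO + pB * poGivenB;
        }
    }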
Evaluation of thresholding & segmentation methods
 Usual to represent the correct classification and false alarm probabilities jointly within a receiver operating characteristic (ROC) curve
 For example, the ROC shows how these vary as a function of threshold for an object/background classification
Evaluation of thresholding & segmentation methods

[Figure: ROC curve, plotting the probability of correct classification (0.0 to 1.0) against the probability of false alarm (0.0 to 1.0) as the threshold T sweeps from 0 to 255; decision rule: g(x,y) ≤ T classified as background, g(x,y) > T classified as object]
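A minimal sketch of how such a curve might be measured in Java (assuming a grey level image g, a boolean ground truth mask marking object pixels, and the convention that pixels above T are classified as object):

    public class RocCurve {
        // Returns one {false alarm, correct classification} pair per threshold T
        public static double[][] roc(double[][] g, boolean[][] truth) {
            double[][] points = new double[256][2];
            for (int T = 0; T < 256; T++) {
                long objectHits = 0, objectTotal = 0, falseAlarms = 0, backgroundTotal = 0;
                for (int y = 0; y < g.length; y++) {
                    for (int x = 0; x < g[y].length; x++) {
                        boolean classifiedObject = g[y][x] > T;
                        if (truth[y][x]) {
                            objectTotal++;
                            if (classifiedObject) objectHits++;
                        } else {
                            backgroundTotal++;
                            if (classifiedObject) falseAlarms++;
                        }
                    }
                }
                points[T][0] = (double) falseAlarms / backgroundTotal; // prob. of false alarm
                points[T][1] = (double) objectHits / objectTotal;      // prob. of correct classification
            }
            return points;
        }
    }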
Evaluation of thresholding & segmentation
methods
 More useful methods of evaluation can be found by
taking account of the application of the segmentation
 Segmentation is rarely an end in itself but a component
in an overall machine vision system
 Also, the level of under- or over- segmentation of an
algorithm needs to be determined
Evaluation of thresholding & segmentation methods

[Figure: a ground truth segmentation alongside under-segmented and over-segmented results]
Evaluation of thresholding & segmentation methods
 Under-segmentation is bad as distinct regions are merged
 Over-segmentation can be acceptable as sub-regions comprising a single ground truth region can be merged using 'high-level' knowledge
 Also, the level of over-segmentation can be controlled by parameter settings of the algorithm
Evaluation of thresholding & segmentation methods
 A possible segmentation metric is to quantify correctly detected regions, over-segmentation and under-segmentation
 Depends upon some threshold setting T
 Region rather than pixel based
 Used in Koester and Spann's paper (IEEE Trans. PAMI, 2000) to evaluate range image segmentations
Evaluation of thresholding & segmentation methods
 Correct detection
 At least T % of the pixels in region k of the segmented image are marked as pixels in region j of the ground truth image
 And vice versa

[Figure: correct detection, with a segmented region k corresponding to a ground truth region j]
Evaluation of thresholding & segmentation methods
 Over-segmentation
 Region j in the ground truth image corresponds to regions k1, k2 … km in the segmented image if:
  At least T % of the pixels in each region ki are marked as pixels of region j
  At least T % of the pixels in region j are marked as pixels in the union of regions k1, k2 … km
Evaluation of thresholding & segmentation methods

[Figure: over-segmentation, with several segmented regions covering one ground truth region]
Evaluation of thresholding & segmentation methods
 Under-segmentation
 Regions j1, j2 … jm in the ground truth image correspond to region k in the segmented image if:
  At least T % of the pixels in region k are marked as pixels in the union of regions j1, j2 … jm
  At least T % of the pixels in each region ji are marked as pixels in region k
Evaluation of thresholding & segmentation methods

[Figure: under-segmentation, with one segmented region covering several ground truth regions]
Evaluation of thresholding & segmentation methods
 The metric also allows us to quantify missed and noise regions
 Missed regions – regions in the ground truth image not found in the segmented image
 Noise regions – regions in the segmented image not found in the ground truth image
 Overall, the average number of correct, over-segmented, under-segmented, missed and noise regions can be quantified over an image database and different algorithms compared
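A minimal sketch of the overlap test underlying these definitions in Java (assuming the ground truth and segmentation are label images whose pixels hold integer region labels, and expressing T as a fraction, e.g. 0.8 for 80%):

    public class RegionMetric {
        // Number of pixels carrying label j in the GT image and label k in the segmentation
        static long overlap(int[][] gt, int[][] seg, int j, int k) {
            long n = 0;
            for (int y = 0; y < gt.length; y++)
                for (int x = 0; x < gt[y].length; x++)
                    if (gt[y][x] == j && seg[y][x] == k) n++;
            return n;
        }

        // Number of pixels carrying label j in a label image
        static long size(int[][] labels, int j) {
            long n = 0;
            for (int[] row : labels)
                for (int v : row)
                    if (v == j) n++;
            return n;
        }

        // Correct detection: at least T% of region k lies in region j, and vice versa
        static boolean correctDetection(int[][] gt, int[][] seg, int j, int k,
                                        double tolerance) {
            long common = overlap(gt, seg, j, k);
            return common >= tolerance * size(seg, k)
                && common >= tolerance * size(gt, j);
        }
    }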
Evaluation of optical flow methods
 Optical flow algorithms compute the 2D optical flow vector at each pixel using consecutive frames in a video sequence
 Optical flow algorithms are notoriously lacking in robustness
 Crucial to evaluate the effectiveness of any method used (or any new method devised)
 Ground truth is usually difficult to come by
Evaluation of optical flow methods

$$\begin{aligned}
\text{Ground truth flow:}\quad & \mathbf{u}(x,y) = \left( u(x,y),\, v(x,y) \right) \\
\text{Flow estimate:}\quad & \hat{\mathbf{u}}(x,y) = \left( \hat{u}(x,y),\, \hat{v}(x,y) \right) \\
\text{Absolute flow error:}\quad & \varepsilon(x,y) = \left| \mathbf{u}(x,y) - \hat{\mathbf{u}}(x,y) \right| \\
\text{Average error:}\quad & \bar{\varepsilon} = \frac{1}{N} \sum_{x,y} \varepsilon(x,y)
\end{aligned}$$
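A minimal sketch of the average error in Java (assuming the true flow (u, v) and the estimate (uHat, vHat) are stored as 2D double arrays of the same size):

    public class FlowError {
        public static double averageError(double[][] u, double[][] v,
                                          double[][] uHat, double[][] vHat) {
            double sum = 0;
            int n = 0;
            for (int y = 0; y < u.length; y++) {
                for (int x = 0; x < u[y].length; x++) {
                    double du = u[y][x] - uHat[y][x];
                    double dv = v[y][x] - vHat[y][x];
                    sum += Math.sqrt(du * du + dv * dv); // |u - uHat| at this pixel
                    n++;
                }
            }
            return sum / n; // average over the N pixels
        }
    }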
Evaluation of optical flow methods
 This simple error measurement naturally amplifies
errors when the flow vectors are large (for the same
relative flow error)
 Can normalize the error by the product of the
magnitudes of the ground truth flow and flow estimate
Evaluation of optical flow methods
 Often the ground truth is not available
 A useful (but often crude) way of comparing the quality of two optical flow fields $\mathbf{u}_1(x,y)$ and $\mathbf{u}_2(x,y)$ is to compute the displaced frame difference (DFD) statistic
 Uses the two consecutive frames of a sequence from which the flows were computed
Evaluation of optical flow methods

$$DFD(t) = \frac{1}{N} \sum_{x,y} \left| f(x,y,t) - f\left( x + u(x,y),\, y + v(x,y),\, t+1 \right) \right|$$
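A minimal sketch of the DFD statistic in Java (assumptions: two consecutive frames as 2D double arrays, flow vectors rounded to the nearest pixel, and displaced positions falling outside the frame skipped):

    public class DFD {
        public static double dfd(double[][] f0, double[][] f1,
                                 double[][] u, double[][] v) {
            int h = f0.length, w = f0[0].length;
            double sum = 0;
            int n = 0;
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    // Round the flow vector to the nearest pixel position in the next frame
                    int xd = x + (int) Math.round(u[y][x]);
                    int yd = y + (int) Math.round(v[y][x]);
                    if (xd < 0 || xd >= w || yd < 0 || yd >= h) continue;
                    sum += Math.abs(f0[y][x] - f1[yd][xd]);
                    n++;
                }
            }
            return sum / n;
        }
    }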
Evaluation of optical flow methods
 DFD is a crude estimate because it says nothing about the accuracy of the motion field directly, just the quality of the pixel mapping from one frame to the next
 Plus it says nothing about the confidence attached to optical flow estimates
 However, it is the basis of the motion compensation algorithms in most current video compression standards (MPEG, H.261 etc.)
Evaluation of optical flow methods
 In optical flow estimation, as in other types of estimation algorithms, we are often interested in the quality of the estimates
 In classical estimation theory, we often compute confidence limits on estimates
  We can say with a certain degree of confidence (say 90%) that the parameter lies within certain bounds
 We usually assume that the quantities we are estimating follow some known probability distribution (for example chi-squared)
Evaluation of optical flow methods
 In the case of optical flow vectors, confidence regions are ellipses in 2 dimensions
 They essentially characterise the distribution of the estimation error
 Assuming a normal distribution of the flow error, confidence ellipses can be drawn for any confidence limit
  Orientation and shape of the ellipses are determined by the covariance matrix defining the normal distribution
  The eigenvalues of the covariance matrix define a particular confidence limit (see the sketch below)
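A minimal sketch of this calculation in Java (not from the slides; it uses the closed form -2 ln(1 - p) for the chi-squared quantile with 2 degrees of freedom, so the semi-axis along each eigenvector is sqrt(-2 ln(1 - p) * lambda)):

    public class ConfidenceEllipse {
        // Covariance matrix [[sxx, sxy], [sxy, syy]]; p is e.g. 0.90 for a 90% ellipse
        public static double[] semiAxes(double sxx, double sxy, double syy, double p) {
            // Eigenvalues of a symmetric 2x2 matrix
            double trace = sxx + syy;
            double det = sxx * syy - sxy * sxy;
            double disc = Math.sqrt(trace * trace / 4 - det);
            double lambda1 = trace / 2 + disc;
            double lambda2 = trace / 2 - disc;
            double chi2 = -2.0 * Math.log(1.0 - p); // chi-squared quantile, 2 dof
            return new double[] { Math.sqrt(chi2 * lambda1), Math.sqrt(chi2 * lambda2) };
        }

        // Orientation (radians) of the major axis, from the principal eigenvector
        public static double orientation(double sxx, double sxy, double syy) {
            return 0.5 * Math.atan2(2 * sxy, sxx - syy);
        }
    }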
Evaluation of optical flow methods

[Figure: 70%, 90% and 99% confidence ellipses of the flow error $\mathbf{u}(x,y) - \hat{\mathbf{u}}(x,y)$]
Evaluation of optical flow methods

[Figure: a frame of the Yosemite sequence with its estimated flow (L&K), the true flow, and the estimated flow after confidence thresholding]
Conclusions
 Evaluation in computer vision is a difficult and often controversial topic
 I would suggest 3 rules of thumb to consider when evaluating your work for the purposes of assignments:
1) Consider carefully your test data. Make it as realistic as possible
2) Make your evaluations as much as possible 'application driven'
3) Make your algorithms 'self evaluating' if possible through the use of confidence statistics