EE4H, M.Sc 0407191
Computer Vision
Dr. Mike Spann
[email protected]
http://www.eee.bham.ac.uk/spannm
Contents
Why evaluate?
Images – synthetic/natural?
Noise
Example 1. Evaluation of
thresholding/segmentation methods
Example 2. Evaluation of optical flow methods
Why evaluate?
Computer vision algorithms are complex and
difficult to analyse mathematically
Evaluation is usually through measurement of the
algorithm’s performance on test images
Use of a range of images to establish a performance
envelope
Comparison with existing algorithms
Performance on degraded (noise-added) images
(robustness)
Sensitivity to algorithm parameter settings
Test images
Real images
‘Ground truth’ difficult to establish
Pseudo-real images
Could be synthetic objects moving against a real
background
Often a good compromise
Synthetic images
Noise and illumination variation over object surfaces
hard to model realistically
Simple synthetic images
Simple ‘object-background’ synthetic images used to
evaluate thresholding and segmentation algorithms
They obey a very simple image model (piecewise
constant + Gaussian noise)
Unrealistic in practice – images are not like this!
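As an illustrative sketch only (Python/NumPy; the function name and parameter values are my own), such a piecewise constant object/background test image might be generated as:

```python
import numpy as np

def synthetic_object_background(w=256, h=256, g_obj=180.0, g_bg=80.0,
                                sigma=10.0, seed=0):
    """Piecewise-constant object/background image plus additive iid Gaussian noise."""
    rng = np.random.default_rng(seed)
    img = np.full((h, w), g_bg, dtype=float)
    img[h // 4:3 * h // 4, w // 4:3 * w // 4] = g_obj  # square 'object' region
    img += rng.normal(0.0, sigma, size=img.shape)      # iid Gaussian noise
    return np.clip(img, 0.0, 255.0)
```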
Simple synthetic images
[Example images: zero noise, low noise, medium noise]
Pseudo-real images
More realistic object/background images are better
suited to evaluating segmentation algorithms
Images of natural objects in natural illumination
Ground truth can be established using hand
segmentation tools (such as those built into many image
processing packages)
Pseudo-real images
[Example images: screws, washers, keys, cars]
Simple synthetic edges
Again, piecewise constant + Gaussian noise image
model
‘Ideal’ step edge
Gives a precise edge location, but is not achievable by
finite-aperture imaging systems
Simple synthetic edges
[Example edge profiles: low noise, medium noise, high noise]
Pseudo-real edges
More realistic edge profiles can be created by
smoothing an ideal step edge
[Diagram: ideal step edge * Gaussian filter = smoothed edge profile]
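A minimal sketch of this smoothing (Python with SciPy's gaussian_filter1d; the function name and default values are my own):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def smoothed_step_edge(n=256, low=50.0, high=200.0, sigma=2.0):
    """Ideal 1D step edge smoothed by convolution with a Gaussian filter."""
    edge = np.where(np.arange(n) < n // 2, low, high).astype(float)
    return gaussian_filter1d(edge, sigma)
```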
Pseudo-real movies
The ‘yosemite’ sequence is a computer-generated
movie rendering a fly-through of the Yosemite valley
Background clouds are real
Enables true flow (ground truth) to be
determined
Used extensively in the evaluation of optical
flow algorithms
yosemite.avi
yosemite_flow.avi
Noise
Often used to evaluate the ‘robustness’ of
algorithms
Additive noise is usual in optical images, but
multiplicative noise is more realistic in sonar/radar
images
Noise level proportional to signal level
Usual noise model is independent random
variables (usually Gaussian)
Correlated noise often more realistic
Noise
Standard noise model is zero-mean independent, identically
distributed (iid) Gaussian (normal) random variables
Characterised by the variance $\sigma^2$
Probability distribution of the rv's:
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right)$$
Noise
Noise level characterised by the signal-to-noise ratio
Usually expressed in dB
Defined as:
$$S/N = 10 \log_{10}\left(S^2/\sigma^2\right) \; \mathrm{dB}$$
$S^2$ is the mean-square grey level, defined (for a $W \times H$
pixel image) as
$$S^2 = \frac{1}{WH} \sum_{x,y} g(x,y)^2$$
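A minimal sketch of measuring this (Python/NumPy; the function name is my own, and the noise variance is taken as known):

```python
import numpy as np

def snr_db(clean_img, sigma):
    """S/N = 10*log10(S^2 / sigma^2), with S^2 the mean-square grey level."""
    s2 = np.mean(clean_img.astype(float) ** 2)
    return 10.0 * np.log10(s2 / sigma ** 2)
```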
Noise
[Example images at three SNR levels, including 30dB and 0dB]
Noise (mean-square error)
We can regard the mean-square error (difference)
between 2 images as noise
Often used to evaluate image compression algorithms in
comparing the original and decompressed images
Image differences can also be expressed as the peak
signal-to-noise ratio (PSNR) in dB by taking the peak signal
level as 255
Noise (mean-square error)
$$\mathrm{mse} = \frac{1}{WH} \sum_{x,y} \left(g(x,y) - \hat{g}(x,y)\right)^2$$
$$\mathrm{PSNR} = 10 \log_{10}\left(255^2/\mathrm{mse}\right) \; \mathrm{dB}$$
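These two measures translate directly into code; a minimal sketch (Python/NumPy, function names my own):

```python
import numpy as np

def mse(g, g_hat):
    """Mean-square error between an image and its approximation."""
    return np.mean((g.astype(float) - g_hat.astype(float)) ** 2)

def psnr_db(g, g_hat):
    """Peak signal-to-noise ratio in dB, taking the peak signal level as 255."""
    return 10.0 * np.log10(255.0 ** 2 / mse(g, g_hat))
```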
Other types of noise
The other main category of (additive) noise is impulse
(sometimes called ‘salt and pepper’) noise
Characterised by the impulse rate (spatial density of
noise impulses) and the mean-square impulse amplitude
Can normally be filtered out easily using median filters
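An illustrative sketch (Python with SciPy's median_filter; the impulse model and all names are my own simplification):

```python
import numpy as np
from scipy.ndimage import median_filter

def add_impulse_noise(img, rate=0.05, seed=0):
    """Set a fraction 'rate' of pixels to 0 or 255 ('salt and pepper')."""
    rng = np.random.default_rng(seed)
    out = img.copy()
    mask = rng.random(img.shape) < rate
    out[mask] = rng.choice([0, 255], size=int(mask.sum()))
    return out

def despeckle(noisy, size=3):
    """Impulse noise is normally well removed by a small median filter."""
    return median_filter(noisy, size=size)
```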
Other types of noise
[Images: original, with salt and pepper noise, de-speckled]
Other types of noise
There are many other types of noise which can be
considered in algorithm evaluation
Essentially more sophisticated and realistic probability
distributions of noise rv’s
For example, a ‘generalised’ Gaussian model is often used to
model heavy-tailed distributions
However, in my humble opinion, a more realistic
source of noise is the deviation away from the
‘ideal’ of the illumination variation across object
surfaces
Evaluation of thresholding & segmentation
methods
Segmentation and thresholding algorithms
essentially group pixels into regions (or classes)
Simplest case is object/background
Simple evaluation metrics just quantify the
number of misclassified pixels
For basic image models, such as constant grey level in
object/background regions plus iid Gaussian noise, the
probability of error can be computed analytically
Evaluation of thresholding & segmentation
methods
For a simple object/background image:
$$P(\text{pixel is an object pixel}) = P_o$$
$$P(\text{pixel is a background pixel}) = P_b$$
$$P(\text{misclassifying an object pixel}) = p(b \mid o)$$
$$P(\text{misclassifying a background pixel}) = p(o \mid b)$$
$$P_{\text{miss}}(T) = P(\text{misclassifying a pixel}) = P_o \, p(b \mid o) + P_b \, p(o \mid b)$$
Evaluation of thresholding & segmentation
methods
Misclassification probability is a function of a
threshold $T$
For a simple constant-region grey-level model plus
additive iid Gaussian noise we can easily derive an
analytical expression for $P_{\text{miss}}(T)$
Not very useful in practice, as the image model is limited and
we also require the ground truth
More useful to simply measure the misclassification
error as a function of threshold
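A minimal sketch of such a measurement (Python/NumPy; the decision rule g >= T -> object and all names are assumptions of mine):

```python
import numpy as np

def p_miss(img, gt_object_mask, T):
    """Measured misclassification probability P_miss(T) for threshold T.
    gt_object_mask is a boolean ground-truth mask (True = object pixel)."""
    classified_object = img >= T     # assumed decision rule
    return np.mean(classified_object != gt_object_mask)

# Sweep T over 0..255 to plot the misclassification error vs. threshold.
```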
Evaluation of thresholding & segmentation
methods
Usual to represent correct classification probabilities
and false alarm probabilities jointly within a receiver
operating characteristic (ROC) curve
For example, the ROC shows how these vary as a
function of threshold for an object/background
classification
Evaluation of thresholding & segmentation
methods
[ROC plot: probability of correct classification (0.0 to 1.0, vertical axis) against probability of false alarm (0.0 to 1.0, horizontal axis), traced as $T$ runs from 0 to 255; each pixel is classified as background or object according to whether $g(x,y)$ falls below or above $T$]
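A minimal sketch of tracing such an ROC by sweeping the threshold (Python/NumPy; names my own, and brighter-pixels-are-object is an assumption):

```python
import numpy as np

def roc_points(img, gt_object_mask):
    """(false-alarm, correct-classification) probability pairs for T = 0..255."""
    pts = []
    for T in range(256):
        detected = img >= T
        p_correct = np.mean(detected[gt_object_mask])   # object pixels detected
        p_false = np.mean(detected[~gt_object_mask])    # background flagged as object
        pts.append((p_false, p_correct))
    return pts
```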
Evaluation of thresholding & segmentation
methods
More useful methods of evaluation can be found by
taking account of the application of the segmentation
Segmentation is rarely an end in itself but a component
in an overall machine vision system
Also, the level of under- or over-segmentation of an
algorithm needs to be determined
Evaluation of thresholding &
segmentation methods
[Images: ground truth, under-segmentation, over-segmentation]
Evaluation of thresholding & segmentation
methods
Under-segmentation is bad as distinct regions are
merged
Over-segmentation can be acceptable, as sub-regions
comprising a single ground truth region can be
merged using high-level knowledge
Also, the level of over-segmentation can be controlled
by parameter settings of the algorithm
Evaluation of thresholding & segmentation
methods
A possible segmentation metric is to quantify correctly
detected regions, over-segmentation and under-segmentation
Depends upon some threshold setting T
Region-based rather than pixel-based
Used in Koester and Spann’s paper (IEEE Trans. PAMI,
2000) to evaluate range image segmentations
Evaluation of thresholding & segmentation
methods
Correct detection
At least T % of the pixels in region k of the
segmented image are marked as pixels in region j of
the ground truth image
And vice versa
[Diagram: corresponding regions in the GT image and the segmentation]
Evaluation of thresholding & segmentation
methods
Over-segmentation
Region j in the ground truth image corresponds to
regions k1, k2… km in the segmented image if :
At least T % of the pixels in region ki are marked as pixels of
region j
At least T % of the pixels in region j are marked as pixels in the
union of regions k1, k2… km
Evaluation of thresholding & segmentation
methods
[Diagram: one GT region split across several segmented regions]
Evaluation of thresholding & segmentation
methods
Under-segmentation
Regions j1, j2… jm in the ground truth image correspond
to region k in the segmented image if :
At least T % of the pixels in region k are marked as pixels in
the union of regions j1, j2… jm
At least T % of the pixels in region ji are marked as pixels in
region k
Evaluation of thresholding & segmentation
methods
[Diagram: several GT regions merged into one segmented region]
Evaluation of thresholding & segmentation
methods
The metric also allows us to quantify missed and
noise regions
Missed regions – regions in the ground truth image not
found in the segmented image
Noise regions – regions in the segmented image not
found in the ground truth image
Overall, the average numbers of correct, over-segmented,
under-segmented, missed and noise regions can be quantified
over an image database and different algorithms
compared
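As a hedged sketch of the correct-detection test above (Python/NumPy; names are my own, and T is written as a fraction rather than a percentage); the over- and under-segmentation tests follow the same overlap pattern applied to unions of regions:

```python
import numpy as np

def correct_detection(seg, gt, k, j, T=0.8):
    """Region k of the segmentation matches region j of the ground truth if
    their overlap is at least a fraction T of each region (both directions)."""
    seg_k = seg == k
    gt_j = gt == j
    overlap = np.logical_and(seg_k, gt_j).sum()
    return overlap >= T * seg_k.sum() and overlap >= T * gt_j.sum()
```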
Evaluation of optical flow methods
Optical flow algorithms compute the 2D optical flow
vector at each pixel using consecutive frames in a video
sequence
Optical flow algorithms are notoriously non-robust
Crucial to evaluate the effectiveness of any method used
(or any new method devised)
Ground truth is usually difficult to come by
Evaluation of optical flow methods
Ground truth flow: $\mathbf{u}(x,y) = (u(x,y), v(x,y))$
Flow estimate: $\hat{\mathbf{u}}(x,y) = (\hat{u}(x,y), \hat{v}(x,y))$
Absolute flow error: $\epsilon(x,y) = \left|\mathbf{u}(x,y) - \hat{\mathbf{u}}(x,y)\right|$
Average error:
$$\bar{\epsilon} = \frac{1}{N} \sum_{x,y} \epsilon(x,y)$$
Evaluation of optical flow methods
This simple error measurement naturally amplifies
errors when the flow vectors are large (for the same
relative flow error)
Can normalize the error by the product of the
magnitudes of the ground truth flow and flow estimate
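A minimal sketch covering both the plain and the normalised error (Python/NumPy; flow fields as H x W x 2 arrays, all names my own):

```python
import numpy as np

def average_flow_error(u, u_hat, normalise=False, eps=1e-12):
    """Mean absolute flow error over an H x W x 2 flow field; optionally
    normalised by the product of ground-truth and estimated magnitudes."""
    err = np.linalg.norm(u - u_hat, axis=-1)
    if normalise:
        mags = np.linalg.norm(u, axis=-1) * np.linalg.norm(u_hat, axis=-1)
        err = err / np.maximum(mags, eps)   # guard against zero-flow pixels
    return float(err.mean())
```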
Evaluation of optical flow methods
Often the ground truth is not available
A useful (but often crude) way of comparing the quality
of two optical flow fields $\mathbf{u}_1(x,y)$ and $\mathbf{u}_2(x,y)$ is to
compute the displaced frame difference (DFD) statistic
Uses the two consecutive frames of a sequence from
which the flows were computed
Evaluation of optical flow methods
$$\mathrm{DFD}(t) = \frac{1}{N} \sum_{x,y} \left| f(x,y,t) - f\left(x + u(x,y),\, y + v(x,y),\, t+1\right) \right|$$
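A minimal sketch of the DFD (Python/NumPy; nearest-pixel displacement and border clipping are my own simplifications):

```python
import numpy as np

def dfd(frame_t, frame_t1, u, v):
    """Displaced frame difference between frame t and frame t+1 under
    flow (u, v), using nearest-pixel displacement."""
    h, w = frame_t.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xd = np.clip(np.rint(xs + u).astype(int), 0, w - 1)
    yd = np.clip(np.rint(ys + v).astype(int), 0, h - 1)
    diff = frame_t.astype(float) - frame_t1[yd, xd].astype(float)
    return float(np.mean(np.abs(diff)))
```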
Evaluation of optical flow methods
DFD is a crude estimate because it says nothing
about the accuracy of the motion field directly –
just the quality of the pixel mapping from one
frame to the next
Plus it says nothing about the confidence attached to
optical flow estimates
However, it is the basis of the motion compensation
algorithms in most current video
compression standards (MPEG, H.261 etc.)
Evaluation of optical flow methods
In optical flow estimation, as in other types of
estimation algorithms, we are often interested in
the quality of the estimates
In classic estimation theory, we often compute
confidence limits on estimates
We can say with a certain degree of confidence (say 90%) that the
parameter lies within certain bounds
We usually assume that the quantities we are estimating
follow some known probability distribution (for example
chi-squared)
Evaluation of optical flow methods
In the case of optical flow vectors, confidence
regions are ellipses in 2 dimensions
They essentially characterise the distribution of the
estimation error
Assuming a normal distribution of the flow error,
confidence ellipses can be drawn for any confidence
limit
Orientation and shape of ellipses determined by the covariance
matrix defining the normal distribution
The eigenvalues of the covariance matrix set the ellipse axis lengths for a
particular confidence limit
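A minimal sketch of deriving such an ellipse from an error covariance matrix (Python with SciPy's chi-squared quantile; names my own):

```python
import numpy as np
from scipy.stats import chi2

def confidence_ellipse(cov, confidence=0.9):
    """Semi-axis lengths and major-axis orientation (radians) of the
    confidence ellipse of a zero-mean 2D normal error with covariance cov."""
    evals, evecs = np.linalg.eigh(cov)              # ascending eigenvalues
    r2 = chi2.ppf(confidence, df=2)                 # squared Mahalanobis radius
    axes = np.sqrt(r2 * evals)                      # semi-axis lengths
    angle = np.arctan2(evecs[1, -1], evecs[0, -1])  # largest-eigenvalue direction
    return axes, angle
```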
Evaluation of optical flow methods
[Figure: 99%, 90% and 70% confidence ellipses of $\mathbf{u}(x,y) - \hat{\mathbf{u}}(x,y)$]
Evaluation of optical flow methods
[Images: Yosemite frame; Yosemite flow (L&K); Yosemite true flow; Yosemite flow (L&K), confidence thresholded]
Conclusions
Evaluation in computer vision is a difficult and
often controversial topic
I would suggest 3 rules of thumb to consider
when evaluating your work for the purposes of
assignments
1) Consider carefully your test data. Make it as realistic as possible
2) Make your evaluations as much as possible ‘application driven’
3) Make your algorithms ‘self evaluating’ if possible through the use of confidence statistics