Foundations of ATR with 3-D Data (1)


A Mathematical Theory of Automatic Target Recognition
Aaron D. Lanterman ([email protected])
School of Electrical and Computer Engineering
What Makes ATR "Harder" than Factoring Large Numbers?
• Factoring large numbers may be NP-hard, but...
• At least it's easy to precisely specify what the problem is!
• Not so easy in ATR
– Subject to controversy
Can You Build an Airplane Without a Theory of Aerodynamics?
• Sure! Without aerodynamic theory, you can do this...
• …but with a theory, you can do this!
Can You Build Communication Systems Without Information Theory?
• Sure! Without Information Theory, you can do this…
• …but with Information Theory, you can do this!
Steam Engines and Thermodynamics
• Dick Blahut likens the situation to steam engines coming before the science of thermodynamics
• First steam engines built by entrepreneurs and "inventors"
– Thomas Savery: 17th and 18th centuries
– Necessity the mother of invention!
• Thermodynamics didn't begin to crystallize until the mid-19th century… but with it, you eventually get
Shannon's Lightning Bolt
• 1948: Claude Shannon's "A Mathematical Theory of Communication"
– Later renamed "The Mathematical Theory of Communication"
• Found fundamental limits on what is possible, e.g. channel capacity
• Before Shannon, your boss might ask you to do the impossible, and fire you if you failed to do it!
• Your boss cannot (well, shouldn't) fire you for failing to exceed channel capacity!
• You can tell your boss you need a better channel
Theory and Technology
• Advances in theory are not enough; you also need the technology
– Aerodynamic theory alone won't get you a B-2; you need advances in materials and manufacturing
– Information theory alone won't get you cell phones; you need fast DSP chips, good batteries, and even more theory (e.g. coding theory)
• Theory tells you what's possible, but sometimes only hints at how to get there
– Quantum computing folks: does this sound familiar?
Info-Theoretic View of ATR (Statistical Estimation-Theoretic)
[Block diagram: Source (Scene Synthesizer) → Channel (Multiple Sensors) → Decoder (Target Recognizer, backed by a Database) → Scene Understanding]
• Source draws a scene X from a Gibbs prior: p(x) = e^{-E(x)} / Z
• Channel produces sensor data Y = Y_1, Y_2, ..., Y_m according to the likelihood: p(y|x) = e^{-E(y|x)} / Z(y|x)
• Decoder forms the estimate X̂ from the posterior: p(x|y) = p(y|x) p(x) / p(y)
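To make these formulas concrete, here is a minimal sketch (my own illustration, not from the slides) that evaluates the posterior p(x|y) over a handful of made-up scene hypotheses; the hypothesis names and energy values are placeholders.

```python
import numpy as np

# Hypothetical discrete scene hypotheses x with made-up Gibbs energies.
# E_prior plays the role of E(x); E_lik plays the role of E(y|x) for one
# fixed, observed data set y.
hypotheses = ["empty scene", "one tank", "one APC", "tank + APC"]
E_prior = np.array([0.0, 1.0, 1.0, 2.5])   # prior energies E(x)
E_lik   = np.array([8.0, 2.0, 3.5, 2.2])   # data-fit energies E(y|x)

# Posterior: p(x|y) is proportional to exp(-E(y|x)) * exp(-E(x)).
log_post = -(E_lik + E_prior)
post = np.exp(log_post - log_post.max())   # subtract the max for numerical stability
post /= post.sum()                         # normalization plays the role of Z and p(y)

for h, p in zip(hypotheses, post):
    print(f"{h:12s} p(x|y) = {p:.3f}")
```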
Optimality Criteria and Performance Bounds
• Hypothesis testing (LRT, GLRT): ML, Bayes, Neyman-Pearson
– Performance bounds: miss and false alarm rates, confusion matrices, Chernoff bounds, Stein's Lemma
• Estimation: ML, MAP, MMSE, Bayes
– Performance bounds: bias, variance, MSE, Cramér-Rao
CIS/MIM
What Makes ATR "Harder" than Designing a Cell Phone?
• The space of X for real-world scenes is extremely complicated
• You don't get to pick p(x)
• Likelihood p(y|x) is difficult to formulate
– The "channel" is often deliberately hostile
• Targets hiding in clutter
• Using decoys and camouflage
• Radars can be subject to jamming
Variability in Complex Scenes
• Geometric variability
– Position
– Orientation
– Articulation
– "Fingerprint"
• Environmental variability
– Thermal variability in infrared
– Illumination variability in visual
• Complexity variability
– Number of objects not known
Ulf Grenander
• Student of Cramér (yes, that Cramér)
• PhD on statistical inference in function spaces (1950)
• "Toeplitz Forms and their Applications" (with Szegö)
– Fundamental work on spectral estimation (1958)
• "Probabilities on Algebraic Structures" (1968)
• "Tutorial on Pattern Theory" - unpublished manuscript
– Inspired the classic paper by Geman & Geman (1984)
General Pattern Theory
• Generalize standard probability, statistics, and shape theory
• Put probability measures on complex structures
– Biological structures
• Mitochondria
• Amoebas
• Brains
• Hippocampus
– Natural language
– Real-world scenes of interest in ATR
The '90s GPT Renaissance
• Made possible by increases in computer power
• Michael Miller (Washington Univ., now at JHU) did a sabbatical with Grenander
• Fields Medalist David Mumford moves from Harvard to Brown; shifts from algebraic geometry to pattern theory
Composite Parameter Spaces
X_k = (\mathbb{R}^3 \times SO(3) \times \mathrm{Types})^k
• Naturally handles obscuration
• Don't know how many targets are in the scene in advance
x \in X = \bigcup_{k=0}^{\infty} (\mathbb{R}^3 \times SO(3) \times \mathrm{Types})^k
• Move away from thinking of detection, location, recognition, etc. as separate problems
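One way to picture this composite parameter space in software (my own sketch, not from the slides): a scene hypothesis is a variable-length list of targets, each carrying a position in R^3, an orientation in SO(3), and a type label.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class Target:
    position: np.ndarray      # element of R^3
    orientation: np.ndarray   # 3x3 rotation matrix, element of SO(3)
    target_type: str          # element of the discrete set "Types"

@dataclass
class Scene:
    # A point in X = union over k of (R^3 x SO(3) x Types)^k;
    # the number of targets k = len(targets) is itself unknown.
    targets: list = field(default_factory=list)

# Example: one scene hypothesis with k = 2 targets.
scene = Scene(targets=[
    Target(np.array([10.0, 4.0, 0.0]), np.eye(3), "tank"),
    Target(np.array([-3.0, 7.5, 0.0]), np.eye(3), "APC"),
])
print("Number of targets k =", len(scene.targets))
```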
Applying the Grenander Program (1)
• Take a Bayesian approach
• Many ATR algorithms seek features that are invariant to
pose (position and orientation)
• Grenander’s Pattern Theory treats pose as nuisance
variable in the ATR problem, and deals with it head on
– Co-estimate pose, or integrate it out
– At a given viewing angle, Target A at one orientation may look
much like Target B at a different orientation
– “…the nuisance parameter of orientation estimation plays a
fundamental role in determining the bound on recognition” Grenander, Miller, & Srivastava
U. Grenander, M.I. Miller, and A. Srivastava, "Hilbert-Schmidt Lower Bounds for Estimators on Matrix Lie Groups for ATR," IEEE Trans. PAMI, Vol. 20, No. 8, Aug. 1998, pp. 790-802.
Applying the Grenander Program (2)
• Develop statistical likelihood
• Data fusion is natural
L = L_LADAR + L_IR + L_MW + L_GUYWITHBINOCULARS
• At first, use as much of the data as possible
– Be wary of preprocessing: edge extraction, segmentation etc.
– Processing can never add information
• Data processing inequality from information theory
I(data; parameters) ≥ I(f(data); parameters)
• If you need to extract features, e.g. for real-time computational tractability, try to lose as little information as possible
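A tiny numerical illustration of the data processing inequality (my own example, with made-up distributions): X is a scene parameter, Y a noisy observation, and f(Y) a feature that throws part of Y away.

```python
import numpy as np

def mutual_info_bits(p_joint):
    """Mutual information (bits) of a discrete joint distribution p(x, y)."""
    px = p_joint.sum(axis=1, keepdims=True)
    py = p_joint.sum(axis=0, keepdims=True)
    mask = p_joint > 0
    return float((p_joint[mask] * np.log2(p_joint[mask] / (px @ py)[mask])).sum())

# X uniform on {0,1,2,3}; Y = X, except with probability eps it is replaced
# by a uniformly random symbol (a crude "sensor").
eps = 0.2
p_y_given_x = (1 - eps) * np.eye(4) + eps / 4
p_xy = (np.ones((4, 1)) / 4) * p_y_given_x        # joint distribution p(x, y)

# "Feature extraction" f(y) = y mod 2 throws away one bit of y.
p_xf = np.zeros((4, 2))
for y in range(4):
    p_xf[:, y % 2] += p_xy[:, y]

print("I(X; Y)    =", round(mutual_info_bits(p_xy), 3), "bits")
print("I(X; f(Y)) =", round(mutual_info_bits(p_xf), 3), "bits (never larger)")
```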
Analytic Performance Bounds
• Estimation bounds on continuous parameters
– Cramér-Rao bounds for continuous pose parameters
– Hilbert-Schmidt metrics for orientation parameters
• Bounds on detection/recognition probabilities
[Photo: Anuj Srivastava]
– Stein’s Lemma, Chernoff bounds
– Asymptotic analysis to approximate probabilities of error
– Performance in a binary test is dominated by a term exponential
in a distance measure between a “true” and an “alternate” target
• Adjust pose of “alternate” target to get closest match to “true” target
as seen by the sensor system
– Secondary term involving CRB on nuisance parameters
• Links pose estimation and recognition performance
U. Grenander, A. Srivastava, and M.I. Miller, “Asymptotic Performance Analysis of Bayesian Target Recognition,”
IEEE Trans. Info. Theory, Vol. 46, No. 4, July 2000, pp. 1658-1665.
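For reference, the textbook statement of Stein's lemma behind the exponential term mentioned above is sketched below; this is the standard i.i.d. form, not the refined bound of the cited paper, which further accounts for the pose nuisance parameters.

```latex
% Standard (textbook) Chernoff-Stein lemma, stated here only to show the kind
% of exponential behavior referred to above; the cited paper refines this to
% account for pose nuisance parameters.
% Test H_0: Y_i ~ p_0 against H_1: Y_i ~ p_1 from n i.i.d. observations, with
% the false-alarm probability held below a fixed alpha in (0,1). The best
% achievable miss probability beta_n then satisfies
\[
  \lim_{n \to \infty} \frac{1}{n} \log \beta_n = -D(p_0 \,\|\, p_1),
  \qquad
  D(p_0 \,\|\, p_1) = \int p_0(y) \log \frac{p_0(y)}{p_1(y)} \, dy .
\]
```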
Reading One of DARPA's BAAs…
• DARPA's E3D program seeks:
– "efficient techniques for rapidly exploiting 3-D sensor data to precisely locate and recognize targets."
• The BAA is full of demands (hopes?) for different stages of the program, such as:
– "The Target Acquisition and Recognition technology areas will develop techniques to locate and recognize articulating, reconfigurable targets under partial obscuration conditions, with an identification probability of 0.85, a target rejection rate less than 5%, and a processing time of 3 minutes per target or less"
…Leads Us to Wonder
• If such a milestone is not reached, is that the fault of the algorithm or the sensor?
– How does the DARPA Program Manager know who to fire?
– Without a theory, the DARPA PM may fire someone who was asked to "exceed channel capacity," i.e. given an impossible task
• What performance from a particular sensor is necessary to achieve a certain level of ATR performance, independent of the question of what algorithm is used?
Perspective Projection
X_k = (\mathbb{R}^2 \times SO(2) \times \mathrm{Types})^k
Sensor Effects
• Optical PSF
• Poisson photocounting noise
• Dead and saturated pixels
Loglikelihood
• CCD loglikelihood of Snyder et al.:
L_{CCD}(y | \lambda) = \sum_i [\, y(i) \ln \mu(i) - \mu(i) \,], where \mu(i) = \sum_j \mathrm{psf}(i | j)\, \lambda(j)
• Cascade with render: x \mapsto \lambda
L(y | x) = L_{CCD}(y | \mathrm{render}(x))
• Sensor fusion is natural; just add loglikelihoods (see the sketch below)
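Below is a minimal numerical sketch of this kind of likelihood (my own illustration): a Poisson-style CCD loglikelihood with a PSF, cascaded with a stand-in render function, and two sensors fused by adding their loglikelihoods. The 1-D pixel geometry, the toy renderer, and the reuse of the same Poisson model for both sensors are simplifying assumptions, not the actual models used in the work described.

```python
import numpy as np

def ccd_loglikelihood(y, lam, psf):
    """Poisson-style CCD loglikelihood: sum_i [ y(i) ln mu(i) - mu(i) ],
    where mu = psf @ lam is the blurred scene radiance."""
    mu = psf @ lam
    return float(np.sum(y * np.log(mu) - mu))

def render(x, n_pixels=64):
    """Stand-in renderer mapping scene parameters x to a 1-D radiance profile.
    A real system would render CAD models through a graphics pipeline."""
    pixels = np.arange(n_pixels)
    lam = np.full(n_pixels, 5.0)                       # background radiance
    for pos, brightness in x:                          # x = list of (position, brightness)
        lam += brightness * np.exp(-0.5 * ((pixels - pos) / 3.0) ** 2)
    return lam

rng = np.random.default_rng(0)
n = 64
psf = np.exp(-0.5 * ((np.arange(n)[:, None] - np.arange(n)[None, :]) / 1.5) ** 2)
psf /= psf.sum(axis=1, keepdims=True)                  # each row of the PSF sums to 1

x_true = [(20.0, 40.0), (45.0, 25.0)]
y_ir = rng.poisson(psf @ render(x_true))               # simulated IR data
y_ladar = rng.poisson(psf @ render(x_true))            # simulated second sensor (toy model)

def fused_loglikelihood(x):
    # Sensor fusion: just add the per-sensor loglikelihoods.
    return (ccd_loglikelihood(y_ir, render(x), psf) +
            ccd_loglikelihood(y_ladar, render(x), psf))

print("L(true scene)  =", round(fused_loglikelihood(x_true), 1))
print("L(wrong scene) =", round(fused_loglikelihood([(10.0, 40.0)]), 1))
```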

Langevin Diffusion Processes
• Write the posterior in Gibbs form: \pi(x) \propto \exp\{-E(x)\}
• Fix the number of targets and the target types
• Simulate a Langevin diffusion:
dX_N(\tau) = -\nabla_{X_N} E(X_N(\tau))\, d\tau + \sqrt{2}\, dW_N(\tau)
• The distribution of X_N(\tau) converges to \pi_N(x_N) as \tau \to \infty
• Compute desired statistics from the samples
• Generalizes to non-Euclidean groups like rotations
• Gradient computation
– Numeric approximations
– Easy and fast on modern 3-D graphics hardware
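A minimal sketch of such a sampler in Euclidean coordinates (my own illustration): an Euler-Maruyama discretization of the diffusion above, with a toy quadratic energy and finite-difference gradients standing in for a renderer-based energy.

```python
import numpy as np

def energy(x):
    """Toy energy E(x); the target density is pi(x) proportional to exp(-E(x)).
    A real system would evaluate negative loglikelihood plus prior via the renderer."""
    return 0.5 * np.sum((x - np.array([2.0, -1.0])) ** 2)

def grad_energy(x, h=1e-4):
    """Numeric (finite-difference) gradient, as mentioned on the slide."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (energy(x + e) - energy(x - e)) / (2 * h)
    return g

rng = np.random.default_rng(1)
dt = 0.01                        # discretization of the diffusion time tau
x = np.zeros(2)                  # continuous parameters for a fixed number of targets/types
samples = []
for step in range(20000):
    # Euler-Maruyama step of dX = -grad E(X) dtau + sqrt(2) dW
    x = x - grad_energy(x) * dt + np.sqrt(2 * dt) * rng.standard_normal(x.shape)
    if step > 5000:              # discard burn-in, then collect samples
        samples.append(x.copy())

samples = np.array(samples)
print("Posterior mean estimate:", samples.mean(axis=0))   # should land near [2, -1]
```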
Jump Processes
• Birth
• Death
• Type-change
Jump Strategies
• Gibbs style
– Sample from a restricted part of the posterior
• Metropolis-Hastings style (sketched below)
– Draw a "proposal" from a "proposal density"
– Accept (or reject) the proposal with a certain probability
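The sketch below illustrates the Metropolis-Hastings style jump moves (birth, death, type-change) on a toy scene. It is my own illustration with a made-up energy, and it glosses over the reverse-move proposal probabilities that a full jump-diffusion sampler must include in its acceptance ratio.

```python
import numpy as np

rng = np.random.default_rng(2)
TYPES = ["tank", "APC", "truck"]

def energy(scene):
    """Toy energy standing in for the negative log posterior; it happens to
    prefer scenes with about two targets."""
    return (len(scene) - 2) ** 2 + 0.1 * len(scene)

def propose_jump(scene):
    """Propose a birth, death, or type-change move on the target list."""
    move = rng.choice(["birth", "death", "type-change"])
    new = list(scene)
    if move == "birth":
        new.append((rng.uniform(-50, 50, size=3), rng.choice(TYPES)))
    elif move == "death" and new:
        new.pop(rng.integers(len(new)))
    elif move == "type-change" and new:
        i = rng.integers(len(new))
        new[i] = (new[i][0], rng.choice(TYPES))
    return new

scene = []                                   # start from an empty scene
for _ in range(5000):
    proposal = propose_jump(scene)
    # Simplified Metropolis-Hastings acceptance (symmetric-proposal approximation).
    if rng.random() < np.exp(energy(scene) - energy(proposal)):
        scene = proposal

print("Final number of targets:", len(scene))
print("Final types:", [t for _, t in scene])
```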
Example Jump-Diffusion Process
Thermal Variability
• Simulations from PRISM: discretizes the target surface using regions from a CAD template and an internal heat transfer model
[Images: Average Static State, Average Dynamic State]
CIS/MIM
Can't Hide from Thermal Variations
[Plots: Performance Variations Due to Thermodynamic Variability; Performance Loss Due to Inaccurate Thermodynamic Information; thermal profiles 8, 45, 75, and 140]
Cooper and Miller, SPIE 97
CIS/MIM
Principal Component Representation of Thermal State
• Model radiance as a scalar random field on the surface
• Compute the empirical mean & covariance from a database of 2000 radiance profiles
• Karhunen-Loève expansion using eigenfunctions of the covariance on the surface - "Eigentanks"
• Add expansion coefficients to the parameter space
– Fortunately, able to estimate them directly given pose
[Photos: Matt Cooper (now with Xerox); a younger, much thinner Aaron Lanterman]
Cooper, Grenander, Miller, Srivastava, SPIE 97
CIS/MIM
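A minimal sketch of the eigen-expansion step (my own illustration, with synthetic profiles standing in for the PRISM database): compute the empirical mean and covariance, take the leading eigenvectors (the "eigentanks"), and represent a thermal state by a few expansion coefficients.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in for a database of radiance profiles: each row holds the
# radiance over n_regions surface facets of a CAD template.
n_profiles, n_regions = 2000, 300
hidden_modes = rng.standard_normal((5, n_regions))
profiles = 20.0 + rng.standard_normal((n_profiles, 5)) @ hidden_modes

# Empirical mean and covariance over the database.
mean = profiles.mean(axis=0)
centered = profiles - mean
cov = centered.T @ centered / (n_profiles - 1)

# Karhunen-Loeve / principal component step: eigenfunctions of the covariance.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigentanks = eigvecs[:, order[:5]]                  # keep the top 5 modes

# Represent a thermal state by a handful of expansion coefficients.
profile = profiles[0]
coeffs = eigentanks.T @ (profile - mean)
reconstruction = mean + eigentanks @ coeffs
print("Top-5 reconstruction error:",
      round(float(np.linalg.norm(profile - reconstruction)), 3))
```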
The First "Eigentanks"
[Images: eigen-modes of thermal variation rendered on the tank surface: Meteorological Variation, Operational Variation, Composite Mode of Variation]
• Remember, we're showing 2-D views of full 3-D surfaces
Cooper, Grenander, Miller, Srivastava, SPIE 97
CIS/MIM
Joint MAP Estimation of Pose and Thermal Signature
[Images: real NVESD M60 data (courtesy James Ratches), initial estimate, final estimate]
Cooper and Miller, SPIE 98
CIS/MIM
"Cost" of Estimating Thermal State
[Plot: MSE performance loss; Comanche, SNR = 5.08 dB]
CIS/MIM
Ladar/IR Sensor Fusion
[Plots: MSE performance bound and information bound; images: FLIR (intensity) and LADAR (range)]
Tom Green, Joe Kostakis, Jeff Shapiro
CIS/MIM
LADAR & IR Sensor Fusion
[Plots: HSB performance curves (LADAR/FLIR Hannon curves) vs. Ladar CNR (dB), for 9 degrees and 15 degrees of orientation error, with MMSE = 0.05 marked]
Kostakis, Cooper, Green, Miller, O'Sullivan, Shapiro, Snyder, SPIE 98 Advanced Techniques ATR III
CIS/MIM
Target Models
• Panzer II (Light Tank): hull length 4.81 m, width 2.28 m, height 2.15 m
• Sturmgeschütz III (Self-Propelled Gun): hull length 6.77 m, width 2.95 m, height 2.16 m
• Semovente M41 (Self-Propelled Gun): hull length 5.205 m, width 2.2 m, height 2.15 m
• M48 A3 (Main Battle Tank): hull length 6.419 m, width 3.63 m, height 3.086 m
(Info and top row of images from 3-D Ladar Challenge Problem slides by Jacobs Sverdrup)
CR-Bound on Orientation
[Plots: Cramér-Rao bound on orientation vs. resolution for the Sturm and Semo targets, with position assumed known and with position unknown and co-estimated]
• Position unknown, must be co-estimated: we take a performance hit!
• Interesting knee at 0.2 meters
M48 vs. Others
• M48 and Panzer have dissimilar signatures; most easily distinguished
• M48 and Semo have similar signatures; most easily confused
Semovente vs. Others
• At higher resolutions, Semo and M48 have the most dissimilar signatures; most easily distinguished (perhaps there are nice features which only become apparent at higher resolutions?)
• At lower resolutions, Semo and Panzer have the most dissimilar signatures; most easily distinguished
• Semo and Sturm have similar signatures; most easily confused
Synthetic Aperture Radar
Joseph O'Sullivan, Michael DeVore
• MSTAR Data Set
• Conditionally Gaussian model for pixel values, with variances trained from data
• Likelihood-based classification
• Target orientation unknown and uniformly distributed over 360° of azimuth
• Joint orientation estimation and target classification
• Train on 17° depression angle
• Test on 15° depression angle
[Images: SAR images and variance images for the T72 and BMP 2]
CIS/MIM
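A minimal sketch of this kind of conditionally Gaussian classifier (my own illustration, with synthetic data and three targets standing in for MSTAR): pixel values are modeled as zero-mean Gaussian with variances that depend on target type and orientation window, and the loglikelihood is maximized jointly over both.

```python
import numpy as np

rng = np.random.default_rng(4)
n_pix, n_windows, n_targets = 80 * 80, 72, 3   # 72 orientation windows of 10 degrees each

# Trained variance images, one per (target, orientation window); here they are
# synthetic stand-ins for variances estimated from training chips.
variances = rng.uniform(0.5, 4.0, size=(n_targets, n_windows, n_pix))

def loglikelihood(y, var):
    """Zero-mean conditionally Gaussian loglikelihood of pixel values y with
    per-pixel variances var (constant terms dropped)."""
    return float(np.sum(-np.log(var) - (np.abs(y) ** 2) / var))

def classify(y):
    """Joint orientation-window estimation and target classification:
    maximize the loglikelihood over both indices."""
    scores = np.array([[loglikelihood(y, variances[t, w])
                        for w in range(n_windows)] for t in range(n_targets)])
    t_hat, w_hat = np.unravel_index(np.argmax(scores), scores.shape)
    return int(t_hat), int(w_hat)

# Simulate a test chip from target 1 in orientation window 40, then classify it.
true_t, true_w = 1, 40
y = rng.normal(scale=np.sqrt(variances[true_t, true_w]))
print("Estimated (target, window):", classify(y), " true:", (true_t, true_w))
```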
[Plot: HS orientation error for 72 windows of 10 degrees each, at 80 x 80 pixels]
• Results using 72 variance images per target of 10° each, and using 80 x 80 pixel sub-images to reduce background clutter
• Probability of correct classification: 98%
• Average orientation error: < 10°
Confusion matrix (rows: true target; columns: classified as):

            2S1  BMP2  BRDM2  BTR60  BTR70    D7   T62   T72  ZIL131  ZSU23-4
2S1         265     0      5      0      0     0     4     0       0        0
BMP 2         0   576      6      4      0     0     0     1       0        0
BRDM 2        2     0    259      0      0     1     1     0       0        0
BTR 60        1     0      1    193      0     0     0     0       0        0
BTR 70        2     2      0      1    191     0     0     0       0        0
D7            2     0      0      0      0   271     1     0       0        0
T 62          2     0      0      0      0     0   265     4       2        0
T 72          0     1      0      0      0     0     4   577       0        0
ZIL131        2     0      0      0      0     0     1     0     271        0
ZSU 23 4      0     0      0      0      0     3     0     1       0      270

• Orientation MSE affects ID!
CIS/MIM
Supported by ARO Center for Imaging Science DAAH04-95-1-0494 and ONR MURI N00014-98-1-0606
Caveat
Do not confuse the model with reality.
Where Should Clutter Go? (1)
data = f(render(param), "noise")
– render(param): a "forward model," i.e. a "scene simulator"
– "noise": e.g. the non-Gaussian minimax entropy texture models of Song Chun Zhu
• A forest might go well in the "noise" part…
Where Should Clutter Go? (2)
• …but downtown Baghdad will not "whiten"
• Structured clutter is the most vexing
• Structured clutter may need to go inside render(param), so that we directly manipulate the clutter
data = f(render(param), "noise")
…or a bit of each
• Where to draw the line?
Acknowledgments
• Much of the work described here was
funded by the ARO Center for Imaging
Science
• Also ONR (William Miceli) and AFOSR (Jon
Sjogren)
• Slides with CIS/MIM tag were adapted from
slides provided by Michael Miller