Learning to Perceive Transparency from the Statistics of Natural Scenes
Anat Levin
School of Computer Science and Engineering
The Hebrew University of Jerusalem
Joint work with Assaf Zomet and Yair Weiss
Transparency
Two layers: $I = I_1 + I_2$
One layer: $I = I_1 + 0$
How does our visual system choose the right decomposition?
$I(x,y) = I_1(x,y) + I_2(x,y)$
•Why not the “simpler” one-layer solution?
•Which two layers, out of infinitely many possibilities?
Talk Outline
•Motivation and previous work
•Our approach
•Results and future work
Transparency in the real world
“Fashion Planet's photographers
have spent the last five years
working to bring you clean
photographs of the windows on
New York especially without the
reflections that usually occur in
such photography”
http://www.fashionplanet.com/sept98/features/reflections/home
Transparency and shading
Transparency: $I(x,y) = I_1(x,y) + I_2(x,y)$
Shading: $I(x,y) = L(x,y)\,R(x,y)$
Transparency in human vision
One layer
Two layers
• Metelli's conditions (Metelli 74)
•T-junctions, X-junctions, doubly reversing junctions
(Adelson and Anandan 90, Anderson 99)
Not obvious how to apply “junction catalogs” to real images.
Transparency from multiple frames
•Two frames with polarizer using ICA (Farid and
Adelson 99, Zibulevsky 02)
•Multiple frames with specific motions (Irani et al. 94,
Szeliski et al. 00, Weiss 01)
Shading from a single frame
Transparency: $I(x,y) = I_1(x,y) + I_2(x,y)$
Shading: $I(x,y) = L(x,y)\,R(x,y)$
•Retinex (Land and McCann 71).
•Color (Drew, Finlayson, and Hordley 02)
•Learning approach (Tappen, Freeman, and Adelson 02)
Talk Outline
•Motivation and previous work
•Our approach
•Results and future work
Our Approach
$I(x,y) = I_1(x,y) + I_2(x,y)$
Ill-posed problem.
Assume probability distributions $\Pr(I_1)$, $\Pr(I_2)$
and search for the most probable solution.
(ICA with a single microphone)
Statistics of natural scenes
[Figure: input image, its dx histogram, and its dx log histogram]
Statistics of derivative filters
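[Figure: log probability against filter output $x$, comparing a Gaussian ($-x^2$), a Laplacian ($-|x|$), and $-|x|^{1/2}$ with the log histogram]
Generalized Gaussian distribution (Mallat 89, Simoncelli 95):
$p(x) \propto e^{-|x|^{\alpha}/s}, \quad \alpha < 1$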
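As a rough illustration of the "dx log histogram" above, the following sketch computes the log histogram of horizontal derivatives of an image with NumPy. The function name `dx_log_histogram`, the bin count, and the finite-difference filter are assumptions made for illustration; the slides do not say which derivative filter or binning was used.

```python
import numpy as np

def dx_log_histogram(image, bins=51):
    """Log of the normalized histogram of horizontal finite differences.

    `bins` should be odd so that one bin is centered on zero.
    Returns (bin_centers, log_probabilities); empty bins give -inf.
    """
    dx = np.diff(np.asarray(image, dtype=float), axis=1).ravel()
    limit = max(float(np.abs(dx).max()), 1e-12)  # symmetric range around 0
    counts, edges = np.histogram(dx, bins=bins, range=(-limit, limit))
    centers = 0.5 * (edges[:-1] + edges[1:])
    prob = counts / counts.sum()
    with np.errstate(divide="ignore"):  # log(0) for empty bins is expected
        log_prob = np.log(prob)
    return centers, log_prob

# Toy input: a bright square on a dark background (sparse edges).
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0
centers, log_prob = dx_log_histogram(img)
```

On an image with sparse edges, almost all derivatives are zero, so the log histogram peaks sharply at the central bin; that sharp peak with heavy tails is the shape the generalized Gaussian with $\alpha < 1$ is meant to capture.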
Is sparsity enough?
[Figure: the same input image written as a sum of two layers in two different ways]
Exactly the same derivatives exist in the one-layer
solution as in the two-layer solution.
Beyond sparseness
• Higher order statistics of filter outputs (e.g. Portilla
and Simoncelli 2000).
•Marginals of more complicated feature detectors
(e.g. Zhu and Mumford 97; Della Pietra, Della Pietra,
and Lafferty 96).
Corners and transparency
[Figure: an image shown as the sum of two layers]
•In typical images, edges are sparse.
•Adding typical images is expected to increase the
number of corners.
•Not true for white noise
Harris-like operator
$c(x_0, y_0) = \det\left( \sum_{x,y} w(x,y) \begin{pmatrix} I_x^2(x,y) & I_x(x,y)\,I_y(x,y) \\ I_x(x,y)\,I_y(x,y) & I_y^2(x,y) \end{pmatrix} \right)$
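A minimal sketch of a Harris-like corner operator of this form, assuming a uniform box window for $w(x,y)$ and central-difference derivatives (neither choice is stated on the slide; the names `box_sum` and `harris_like` are hypothetical):

```python
import numpy as np

def box_sum(a, radius):
    """Sum `a` over a (2*radius+1)^2 window, zero-padded at the borders."""
    size = 2 * radius + 1
    padded = np.pad(a, radius)  # constant zero padding by default
    out = np.zeros(a.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_like(image, radius=2):
    """Determinant of the windowed second-moment matrix of the gradient."""
    image = np.asarray(image, dtype=float)
    iy, ix = np.gradient(image)  # derivatives along rows (y) and columns (x)
    sxx = box_sum(ix * ix, radius)
    sxy = box_sum(ix * iy, radius)
    syy = box_sum(iy * iy, radius)
    return sxx * syy - sxy ** 2

# Two layers, each containing one straight step edge.
v = np.zeros((32, 32)); v[:, 16:] = 1.0   # vertical edge
h = np.zeros((32, 32)); h[16:, :] = 1.0   # horizontal edge
corners_sum = harris_like(v + h)
```

Each single-edge layer gives a zero response everywhere, because all gradients inside any window are parallel. Their sum produces a positive response where the edges cross, which matches the claim that adding typical images tends to create corners.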
Corner histograms
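[Figure: log histograms of the derivative filter and the corner operator]
Fitting:
Derivative filter: $p(x) \propto e^{-|x|^{\alpha}/s_1}$
Corner operator: $p(x) \propto e^{-|x|^{\beta}/s_2}$
Typical exponents for natural images:
Derivative filter: $\alpha = 0.7$
Corner operator: $\beta = 0.2$
$s_1/s_2 \approx 1$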
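The slides do not say how the exponents were fitted. One simple possibility, shown here only as a sketch, is a straight-line fit in log-log space: if $\log p(x) = \log p(0) - |x|^{\alpha}/s$, then $\log(\log p(0) - \log p(x)) = \alpha \log|x| - \log s$. The function name `fit_generalized_gaussian` is hypothetical, and the peak value $\log p(0)$ is assumed known.

```python
import numpy as np

def fit_generalized_gaussian(x, log_p, log_p0):
    """Fit log p(x) = log_p0 - |x|**alpha / s by a line in log-log space.

    Returns (alpha, s). Points with x == 0 or a non-positive drop are skipped.
    """
    x = np.abs(np.asarray(x, dtype=float))
    drop = log_p0 - np.asarray(log_p, dtype=float)  # equals |x|**alpha / s
    keep = (x > 0) & (drop > 0)
    slope, intercept = np.polyfit(np.log(x[keep]), np.log(drop[keep]), 1)
    return slope, np.exp(-intercept)

# Synthetic check: exact log-density values with alpha = 0.7, s = 0.5.
xs = np.linspace(0.05, 2.0, 40)
true_log_p = -1.3 - xs ** 0.7 / 0.5
alpha_hat, s_hat = fit_generalized_gaussian(xs, true_log_p, log_p0=-1.3)
```

With real histograms the estimate of $\log p(0)$ and the noise in sparsely populated bins both affect the fit, so this is best read as an illustration of what "fitting the exponent" means rather than as a robust estimator.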
Simple prior for transparency prediction
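The log probability of a decomposition $I = I_1 + I_2$ is
$\log P(I_{1,x}, I_{1,y}) + \log P(I_{2,x}, I_{2,y})$
where each layer is scored by
$\log P(I_x, I_y) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$
$\alpha = 0.7, \quad \beta = 0.2, \quad \eta = s_1/s_2 \approx 1$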
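The prior can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: the window $w$ is taken to be a 3x3 box, derivatives are central differences, the constant $\log Z$ is dropped, and the weight on the corner term (written `ETA` here) is set to 1. The function names are hypothetical.

```python
import numpy as np

ALPHA, BETA, ETA = 0.7, 0.2, 1.0  # exponents from the slide; ETA assumed 1

def window_sum(a, radius=1):
    """Sum `a` over a (2*radius+1)^2 box window with zero padding."""
    padded = np.pad(a, radius)
    rows, cols = a.shape
    size = 2 * radius + 1
    return sum(padded[r:r + rows, c:c + cols]
               for r in range(size) for c in range(size))

def neg_log_prior(image):
    """Sum over pixels of |grad I|^ALPHA + ETA * c^BETA (log Z dropped)."""
    iy, ix = np.gradient(np.asarray(image, dtype=float))
    c = window_sum(ix * ix) * window_sum(iy * iy) - window_sum(ix * iy) ** 2
    c = np.clip(c, 0.0, None)  # guard against tiny negative round-off
    return float(np.sum(np.hypot(ix, iy) ** ALPHA + ETA * c ** BETA))

def decomposition_cost(image, layer1):
    """Negative log probability of the split image = layer1 + layer2."""
    layer2 = np.asarray(image, dtype=float) - layer1
    return neg_log_prior(layer1) + neg_log_prior(layer2)

# Two crossing step edges: compare the two-layer and one-layer explanations.
v = np.zeros((32, 32)); v[:, 16:] = 1.0
h = np.zeros((32, 32)); h[16:, :] = 1.0
img = v + h
two_layer = decomposition_cost(img, v)    # layers: v and h
one_layer = decomposition_cost(img, img)  # layers: img and 0
```

For this toy input the two-layer split has the lower cost (higher probability): the gradient term slightly favors one layer, but the corner term at the crossing outweighs it.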
Does this predict transparency?
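$\log P(I_x, I_y) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$
[Figure: two examples, each with panels $I$ and $I_1$]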
How important are the statistics?
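$\log P(I_x, I_y) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$
$\alpha = 0.7, \quad \beta = 0.2, \quad \eta = 1$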
Is it important that the statistics are non-Gaussian?
Would any cost that penalizes
high gradients and corners work?
The importance of being non Gaussian
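$\log P(I_x, I_y) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$
[Figure: panels $I$ and $I_1$ for $\alpha = 0.7$, $\beta = 0.2$, and panels $I$ and $I_1$ for $\alpha = 2$, $\beta = 2$]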
The “scalar transparency” problem
$a + b = 1$, with $a \ge 0$, $b \ge 0$
Consider a prior over positive scalars: $p(x) \propto e^{-x^{\alpha}}$
For which priors is the MAP solution sparse?
The “scalar transparency” problem
$a + b = 1$, with $a \ge 0$, $b \ge 0$
Observation:
The MAP solution is obtained with a=0, b=1 or a=1,
b=0 if and only if f(x)=log P(x) is convex.
[Figure: two plots over the interval 0 to 1 comparing $f\left(\frac{a+b}{2}\right)$ with $\frac{f(a)+f(b)}{2}$. First plot: MAP solution a=0, b=1. Second plot: MAP solution a=0.5, b=0.5.]
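The observation can be checked numerically with a grid search. The sketch below assumes the prior $p(x) \propto e^{-x^{\alpha}}$, so that maximizing $\log p(a) + \log p(b)$ under $a + b = 1$ is the same as minimizing $a^{\alpha} + (1-a)^{\alpha}$; the function name `map_split` and the grid resolution are choices made for illustration.

```python
import numpy as np

def map_split(alpha, steps=1001):
    """Grid-search the MAP (a, b) with a + b = 1, a >= 0, b >= 0.

    The cost is the negative log prior a**alpha + b**alpha, up to a constant.
    Ties are broken in favor of the smallest a.
    """
    a = np.linspace(0.0, 1.0, steps)
    cost = a ** alpha + (1.0 - a) ** alpha
    best = int(np.argmin(cost))
    return float(a[best]), float(1.0 - a[best])

sparse_a, sparse_b = map_split(0.7)  # log prior -x**0.7 is convex
dense_a, dense_b = map_split(2.0)    # Gaussian-like log prior -x**2 is concave
```

With $\alpha = 0.7$ the minimum sits at an endpoint (all of the signal goes to one variable), while with $\alpha = 2$ it sits at $a = b = 0.5$, which is the contrast drawn in the two plots.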
The importance of being non Gaussian
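[Figure: plots over the interval 0 to 1 (ticks at 0, 0.5, 1) with panels $I$ and $I_1$, for $\alpha = 0.7$, $\beta = 0.2$ and for $\alpha = 2$, $\beta = 2$]
$\log P(I_x, I_y) = -\log Z - \sum_{x,y} \left( |\nabla I(x,y)|^{\alpha} + \eta\, c(x,y)^{\beta} \right)$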
Can we perform a global optimization?
Conversion to discrete MRF
[Figure: grid of MRF nodes $g_1, \dots, g_{15}$]
For the decomposition $I = I_1 + I_2$:
$g_i = (I_{1,x}, I_{1,y})$: the $I_1$ gradient at location $i$
$f_i = (I_x, I_y) - g_i$
Local potential (derivative filters):
$\Psi_i(g_i) = e^{-\left( |g_i|^{\alpha} + |f_i|^{\alpha} \right)}$
Pairwise potential (pairwise approximation to the corner operator):
$\Psi_{i,j}(g_i, g_j) = e^{-\eta \left( \det(g_i g_i^T + g_j g_j^T)^{\beta} + \det(f_i f_i^T + f_j f_j^T)^{\beta} \right)}$
$\Psi_{i,j,k}(g_i, g_j, g_k)$: enforces integrability
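The local and pairwise potentials can be sketched as follows. The slide text is garbled at this point, so the exact placement of the exponents and of the corner weight (called `ETA` here) is an assumption chosen to agree with the prior used earlier in the talk; the function names are hypothetical. The code uses the identity $\det(u u^T + v v^T) = (u_x v_y - u_y v_x)^2$ for 2-vectors, which avoids round-off from a numerical determinant.

```python
import numpy as np

ALPHA, BETA, ETA = 0.7, 0.2, 1.0  # assumed parameter values

def local_potential(g_i, grad_i):
    """exp(-(|g_i|^ALPHA + |f_i|^ALPHA)) with f_i = image gradient - g_i."""
    g = np.asarray(g_i, dtype=float)
    f = np.asarray(grad_i, dtype=float) - g
    return float(np.exp(-(np.linalg.norm(g) ** ALPHA
                          + np.linalg.norm(f) ** ALPHA)))

def pair_det(u, v):
    """det(u u^T + v v^T) for 2-vectors, via the squared cross product."""
    return (u[0] * v[1] - u[1] * v[0]) ** 2

def pairwise_potential(g_i, g_j, grad_i, grad_j):
    """Corner-like penalty on both layers for a pair of neighboring sites."""
    g_i, g_j = np.asarray(g_i, float), np.asarray(g_j, float)
    f_i = np.asarray(grad_i, float) - g_i
    f_j = np.asarray(grad_j, float) - g_j
    cost = pair_det(g_i, g_j) ** BETA + pair_det(f_i, f_j) ** BETA
    return float(np.exp(-ETA * cost))

# Parallel gradients in layer 1, nothing left for layer 2: no corner penalty.
parallel = pairwise_potential((1, 0), (2, 0), (1, 0), (2, 0))
# Perpendicular gradients in layer 1: the pair looks like a corner.
corner = pairwise_potential((1, 0), (0, 1), (1, 0), (0, 1))
```

The pairwise term is 1 (no penalty) when neighboring gradients within each layer are parallel and drops when they are not, which is how it approximates the corner operator using only pairs of sites.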
Optimizing discrete MRF
$I = I_1 + I_2$
$g_i = (I_{1,x}, I_{1,y})$: discretization of the $I_1$ gradient at location $i$.
$f_i = (I_x, I_y) - g_i$: discretization of the $I_2$ gradient at location $i$.
$P(g) = \frac{1}{Z} \prod_i \Psi_i(g_i) \prod_{i,j} \Psi_{i,j}(g_i, g_j) \prod_{i,j,k} \Psi_{i,j,k}(g_i, g_j, g_k)$
$|g|^N$ possible assignments.
Solution: use max-product belief propagation.
The MRF has many cycles, but BP works in similar problems (Freeman and
Pasztor 99, Frey et al. 2001, Sun et al. 2002).
Converges to a strong local minimum (Weiss and Freeman 2001).
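To make the max-product idea concrete, here is a minimal sketch on a chain of nodes, written in the log domain (max-sum). This is a deliberate simplification: on a chain the result is the exact MAP assignment, whereas the MRF in the talk is a grid with many cycles, where the same message updates are iterated and only the local-optimality guarantee cited above applies. The function names and the random potentials are hypothetical.

```python
import numpy as np

def max_product_chain(unary, pairwise):
    """MAP labels on a chain by max-product message passing (log domain).

    unary:    (N, K) array of log local potentials.
    pairwise: (K, K) array of log pairwise potentials, shared by all
              neighboring pairs and indexed [label at t-1, label at t].
    """
    n, k = unary.shape
    best = np.zeros((n, k))             # best log score ending in each label
    back = np.zeros((n, k), dtype=int)  # argmax pointers for backtracking
    best[0] = unary[0]
    for t in range(1, n):
        scores = best[t - 1][:, None] + pairwise  # rows: prev, cols: current
        back[t] = np.argmax(scores, axis=0)
        best[t] = unary[t] + np.max(scores, axis=0)
    labels = np.zeros(n, dtype=int)
    labels[-1] = int(np.argmax(best[-1]))
    for t in range(n - 1, 0, -1):
        labels[t - 1] = back[t, labels[t]]
    return labels

def log_score(labels, unary, pairwise):
    """Log of the unnormalized probability of a label assignment."""
    total = sum(unary[t, l] for t, l in enumerate(labels))
    total += sum(pairwise[a, b] for a, b in zip(labels[:-1], labels[1:]))
    return float(total)

rng = np.random.default_rng(0)
unary = rng.normal(size=(6, 3))
pairwise = rng.normal(size=(3, 3))
map_labels = max_product_chain(unary, pairwise)
```

The cost of this pass grows linearly with the number of nodes and quadratically with the number of labels per node, instead of exponentially as in exhaustive search over all assignments.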
Drawbacks of BP for this problem
•Large memory and time complexity.
•Convergence depends on update order.
•Discretization artifacts
Talk Outline
•Motivation and previous work
•Our approach
•Results and future work
Results
[Figure: input, output layer 1, output layer 2]
Results
[Figure: input, output layer 1, output layer 2]
Future Work
•Dealing with a more complex texture
[Figure: a textured image shown as a sum of layers, with panels "Original" and "Non-linear filter"]
Future Work
•Dealing with a more complex texture:
•A coarse qualitative separation.
•Learn discriminative features automatically
•Applying other optimization methods.
•Extend to shading and illumination.
•Use application specific priors (e.g. Manhattan World)
Conclusions
•Natural scene statistics predict perception of
transparency.
•First algorithm that can decompose a single image into
the sum of two images.