Transcript MODVIS2012

Modeling perceptual variations by
neural decoding
Qasim Zaidi
Elias H. Cohen
NEI grants EY07556 and EY13312
Shape Constancy
•Shape is the geometrical property of an
object that is invariant to location,
rotation and scale.
The Future Building, Manhattan
•The ability to perceive the shape of a
rigid object as constant across
viewpoints has been considered
essential to perceiving objects
accurately.
•The visual system does not discount all
perspective distortions, so the shapes of
many 3-D objects change with viewpoint.
•Can shape constancy be expected for
rotations of the image plane?
North view
South view
(Griffiths & Zaidi, 2000)
3-D Shape Constancy across image rotations?
Convex
Concave
Vertical
Oblique
Does rotating from vertical to oblique preserve perceived depth?
Stimuli
Convex
Shapes
Perspective projection
of convex and concave
wedges (in circular
window).
Experiment 1
compared 5 vertical
shapes to 5 oblique
shapes in depth
(concave to concave &
convex to convex).
Vertical
Oblique
Texture
Sine-wave gratings
• 3 spatial frequencies: 1, 3, and 6 cpd.
• Oriented at 90, ±67.5, ±45, & ±22.5 degrees
(wrt 3D axis).
• Added in randomized phases to make 10 different
textures per shape.
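The texture recipe above can be sketched as a sum of sine-wave gratings. This is only an illustration under stated assumptions, not the authors' stimulus code: it assumes all 21 frequency-orientation combinations are summed with equal amplitude, treats the listed orientation as the direction of luminance modulation, and builds a flat texture before any mapping onto a 3-D wedge. The names `make_texture` and `luminance` are hypothetical.

```python
import math
import random

FREQS_CPD = [1.0, 3.0, 6.0]  # spatial frequencies from the slide
ORIENTS_DEG = [90.0, 67.5, -67.5, 45.0, -45.0, 22.5, -22.5]  # relative to the 3-D axis

def make_texture(seed):
    """Return a function giving texture luminance at (x, y) in degrees of visual angle."""
    rng = random.Random(seed)
    # One random phase per grating component (assumption: every combination is used).
    components = [(f, math.radians(o), rng.uniform(0.0, 2.0 * math.pi))
                  for f in FREQS_CPD for o in ORIENTS_DEG]

    def luminance(x_deg, y_deg):
        total = 0.0
        for freq, orient, phase in components:
            # Position projected onto the assumed modulation direction of the grating.
            u = x_deg * math.cos(orient) + y_deg * math.sin(orient)
            total += math.sin(2.0 * math.pi * freq * u + phase)
        return total / len(components)  # normalized to lie within [-1, 1]

    return luminance

# Ten textures per shape, each with its own set of random phases.
textures = [make_texture(seed) for seed in range(10)]
```

Seeding each texture separately keeps the ten phase sets reproducible across runs.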
Concave
Exp 1
Failures of 3-D shape constancy
Vertical vs. Oblique comparison
task.
Subjects view two shapes
sequentially.
Which shape is greater in
depth?
[Trial timeline: five successive 500 msec intervals]
Exp 1: Shape Comparison
Results
The same shape was perceived to be deeper when it was oriented
vertically than when it was oriented obliquely.
Oblique shapes were matched to vertical shapes of 0.77 times the
depth of the oblique shape (S.E. = .007).
3D Shape From Texture
Perception of shape from texture depends on
patterns of orientation flows (Li & Zaidi, 2001; 2004)
Textured shape with no orientation
component orthogonal to axis of
curvature.
Origins of oblique bias for 3D shape
Is the 3D oblique bias (OB) explained by an OB for 2D oriented components?
Is there a corresponding OB for single 2D angles?
Exp 2
Failures of 2-D angle constancy
Vertical vs. Oblique comparison task.
Subjects view two shapes sequentially.
Which angle is sharper?
[Trial timeline: five successive 500 msec intervals]
Exp 2: Angle Comparison
Results
The same angle was perceived to be sharper when it was oriented
vertically than when it was oriented obliquely.
Oblique angles were matched to vertical angles 4.5° shallower
on average.
Predicting the 3-D depth bias from the 2-D angle bias
$$\frac{s_{vert}}{s_{oblq}} = \frac{a_{vert}}{a_{oblq}}$$
irrespective of h.
The average ratio of perceptually equivalent 2-D slopes = 0.862 (SE = .001)
Ratio of perceptually equivalent 3-D depths = 0.771 (SE = .007)
3-D depth inconstancy can be explained by anisotropy in perception of 2-D features.
Orientation anisotropies in cat V1 cells (Li et al., 2003)
Ferret area 17 anisotropy (Coppola et al., 1998)
Oriented energy in natural images
Hansen & Essock (2004)
Girshick, Landy & Simoncelli (2011)
Distribution of oriented contours in indoor (A), outdoor (B), and entirely natural (piedmont
forest) (C) environments.
Coppola D M et al. PNAS 1998;95:4002-4006
©1998 by National Academy of Sciences
Stimulus orientation decoded from cortical responses
The probability that an orientation-tuned cell will give a spike in response to an
orientation θ is determined by its tuning curve f_i(θ) (Sanger, 1996):
$$P(\text{spike} \mid \theta) \propto f_i(\theta)$$
The probability of the cell giving n_i spikes is given by a Poisson distribution:
$$P(n_i \mid \theta) = \frac{f_i(\theta)^{n_i}\, e^{-f_i(\theta)}}{n_i!}$$
For independently responding neurons, the probability of n_i spikes each from k
cells is given by the product of the probabilities:
$$P(n_1, n_2, \ldots, n_k \mid \theta) = \frac{\prod_{i=1}^{k} f_i(\theta)^{n_i}\, e^{-f_i(\theta)}}{\prod_{i=1}^{k} n_i!}$$
Stimulus orientation decoded from cortical responses
Using Bayes' formula, the optimal estimate of the stimulus is the peak of the
posterior probability distribution (P(θ) = probability of θ in natural images):
$$P(\theta \mid n_1, n_2, \ldots, n_k) = \frac{P(\theta) \prod_{i=1}^{k} f_i(\theta)^{n_i}\, e^{-f_i(\theta)}}{C}$$
Equivalently, the peak of the log of the posterior:
$$\log P(\theta \mid n_1, \ldots, n_k) = \log P(\theta) - \log C + \sum_{i=1}^{k} n_i \log f_i(\theta) - \sum_{i=1}^{k} f_i(\theta)$$
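The step from the posterior to its logarithm can be written out as follows. This is a worked restatement rather than part of the original slide, and it assumes that C collects every factor that does not depend on θ:

$$
\begin{aligned}
\log P(\theta \mid n_1, \ldots, n_k)
&= \log P(\theta) + \sum_{i=1}^{k} \log\!\left[ f_i(\theta)^{n_i}\, e^{-f_i(\theta)} \right] - \log C \\
&= \log P(\theta) - \log C + \sum_{i=1}^{k} n_i \log f_i(\theta) - \sum_{i=1}^{k} f_i(\theta)
\end{aligned}
$$

Because the logarithm is monotonic, the θ that maximizes the log posterior also maximizes the posterior, and the constant log C does not affect the location of the peak.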
Given d_i cells tuned to each orientation θ_i, the equation is grouped using
average responses:
$$\log P(\theta \mid n_1, \ldots, n_k) = \log P(\theta) - \log C + \sum_{i=1}^{m} d_i\, n_i \log f_i(\theta) - \sum_{i=1}^{m} d_i\, f_i(\theta)$$
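The grouped log posterior can be maximized numerically over a grid of candidate orientations. The sketch below is an illustration under assumptions, not the authors' simulation: the Gaussian tuning curve and its parameters are hypothetical, the prior is passed in as a function, and the constant log C is omitted because it does not move the peak. The names `tuning_curve`, `log_posterior`, and `decode_orientation` are made up for this example.

```python
import math

def tuning_curve(theta_deg, preferred_deg, width_deg=20.0, peak_rate=10.0, baseline=0.5):
    """Hypothetical tuning curve: Gaussian in orientation difference (period 180 deg)."""
    diff = (theta_deg - preferred_deg + 90.0) % 180.0 - 90.0
    return baseline + peak_rate * math.exp(-0.5 * (diff / width_deg) ** 2)

def log_posterior(theta_deg, mean_counts, preferred, n_cells, log_prior):
    """log P(theta) + sum_i d_i * (n_i * log f_i(theta) - f_i(theta)), dropping log C."""
    total = log_prior(theta_deg)
    for n_bar, pref, d in zip(mean_counts, preferred, n_cells):
        rate = tuning_curve(theta_deg, pref)
        total += d * (n_bar * math.log(rate) - rate)
    return total

def decode_orientation(mean_counts, preferred, n_cells, log_prior, step_deg=0.5):
    """Return the grid orientation (in degrees) with the highest log posterior."""
    grid = [i * step_deg for i in range(int(180.0 / step_deg))]
    return max(grid, key=lambda t: log_posterior(t, mean_counts, preferred, n_cells, log_prior))

# Usage: noise-free mean responses to a 60 degree line, flat prior, one cell per group.
preferred = [i * 22.5 for i in range(8)]
n_cells = [1] * 8
mean_counts = [tuning_curve(60.0, p) for p in preferred]
estimate = decode_orientation(mean_counts, preferred, n_cells, lambda t: 0.0)
```

Changing `n_cells`, the tuning widths per group, or `log_prior` in this sketch is one way to explore how each kind of anisotropy would shift the decoded orientation.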
Stimulus angle decoded from cortical responses
Using orientation tuned cells in V1, plus cross-orientation inhibition, we derived
a matrix valued tuning function for (V4?) cells selective for angles Ω composed
of two lines θ_p and θ_q:
$$F_i(\Omega) = \max\big(g_i(\theta_p) + g_i(\theta_q) - h_i(\theta_p) - h_i(\theta_q),\ \varepsilon\big)$$
For the prior P(Ω) we made the rough approximation:
$$P(\Omega) \approx P(\theta_p)\, P(\theta_q)$$
Finally, stimulus angles were decoded from the population responses of
orientation tuned cells using an equation similar to that for orientations:
$$\log P(\Omega \mid n_1, \ldots, n_k) = \log P(\Omega) - \log C + \sum_{i=1}^{m} d_i\, n_i \log F_i(\Omega) - \sum_{i=1}^{m} d_i\, F_i(\Omega)$$
ASSUMPTION: Observer perceives an angle equal to the optimally decoded
angle, i.e. the peak of the posterior probability distribution
Stimulus angle 140º
Decoded oblique angle 142º
Decoded vertical angle 138º
From cortical anisotropy to shape inconstancy
1. We show an oblique bias for 3-D appearance.
2. The 3-D effect can be explained by an oblique bias for 2D angles.
3. Simulations show that the anisotropy in orientation tuning
of cortical neurons plus cross-orientation inhibition
explains the 2-D oblique bias.
4. Anisotropy in numbers of cells predicts the opposite bias.
5. The predictions were insensitive to the prior distribution.
Consequences of the oblique bias for
angle perception
Zucker et al
Tse
Cohen & Singh
Fleming et al
Conclusions
1. If the perception of 3D shape depends on
the extraction of simple image features,
then bias in the appearance of the image
features will lead to bias in the appearance
of 3D shape.
2. Variations in properties within neural
populations can have direct effects on
visual percepts, and need to be included in
neural decoding models.
Cohen EH and Zaidi Q Fundamental failures of shape constancy due to
cortical anisotropy. Journal of Neuroscience (2007).
Perceived angles decoded from
cortical responses
Having traced 3-D perceptual anisotropy to
an oblique bias for 2-D angles, we used a
probabilistic stimulus decoding model
(Sanger, 1996) to test whether this 2-D
bias could be explained by anisotropies in
numbers or tuning widths of cortical cells
tuned to different orientations (Li et al.,
2003), or the anisotropic distribution of
oriented energy in images of natural
scenes (Hansen & Essock, 2004). We first
derived the probabilities of numbers of
spikes from individual orientation tuned
cells in response to an angle stimulus.
Using Bayes’ formula, we then decoded the
most probable angle given the population
response. To compare the model’s
predictions with the experimental
measurements, we assumed that the
observer perceives an angle equal to the
optimally decoded angle.
Neural Population Decoding Model
To show how we decode angles from population responses, we begin with decoding the
orientation of a single line. We make the assumption that a cell tuned to the orientation θ_i
gives an action potential in response to any orientation θ, with the probability:
$$P(\text{spike} \mid \theta) \propto f_i(\theta) \qquad (1)$$
where f_i(θ) is the tuning curve of the cell with respect to θ, i.e. the average firing rate for
each value of θ. If the probability of firing is a constant within time intervals of the same
length, then the firing is governed by a Poisson distribution. The probability of the cell
tuned to θ_i firing n_i spikes is given by:
$$P(n_i \mid \theta) = \frac{f_i(\theta)^{n_i}\, e^{-f_i(\theta)}}{n_i!} \qquad (2)$$
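Equation 2 can be evaluated numerically as in the sketch below. It is a minimal illustration, not the authors' code: the Gaussian tuning curve and its parameters (`width_deg`, `peak_rate`, `baseline`) are hypothetical, and the population function assumes independently responding cells so that log-probabilities add.

```python
import math

def tuning_curve(theta_deg, preferred_deg, width_deg=20.0, peak_rate=10.0, baseline=0.5):
    """Hypothetical tuning curve f_i(theta): mean spike count for orientation theta."""
    diff = (theta_deg - preferred_deg + 90.0) % 180.0 - 90.0
    return baseline + peak_rate * math.exp(-0.5 * (diff / width_deg) ** 2)

def log_poisson(n, rate):
    """Log of Equation 2: log(rate**n * exp(-rate) / n!)."""
    return n * math.log(rate) - rate - math.lgamma(n + 1)

def population_log_prob(counts, preferred, theta_deg):
    """Log-probability of all spike counts, assuming independent cells."""
    return sum(log_poisson(n, tuning_curve(theta_deg, p))
               for n, p in zip(counts, preferred))

# Usage: eight cells with preferred orientations spaced 22.5 degrees apart.
preferred = [i * 22.5 for i in range(8)]
counts = [1, 6, 10, 6, 1, 1, 0, 1]  # made-up spike counts for illustration
log_p_45 = population_log_prob(counts, preferred, 45.0)
log_p_90 = population_log_prob(counts, preferred, 90.0)
```

Working in log-probabilities avoids numerical underflow when many cells are multiplied together, and `math.lgamma(n + 1)` supplies log(n!) without computing large factorials.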
Justifications, alternatives, and implications of the model
To obtain the decoding solution, a number of assumptions were made: (i)
Shapes of orientation tuning curves are not constrained but are assumed to
be invariant to signal-strength, based on orientation tuning curves in V1
being contrast-invariant (Sclar & Freeman, 1982). (ii) The variation in firing
rates of cortical neurons is described by Poisson statistics, but more general
Poisson-like exponential distributions would suffice (Ma et al., 2006). (iii)
The assumption that responses of cells tuned to different angles at different
orientations are independent leads to a simple Bayes-optimal solution, but
noise in the cortex is correlated across cells. However, a decoding model
incorporating the structure of neural correlations (Ma et al., 2006) requires
empirical estimates that do not yet exist, and the relatively constant
variability observed across cortical stages suggests that noise correlations
may be propagated by down-stream neurons (Shadlen & Newsome, 1998).
In addition, a neural-correlation function based on similarities between
preferred stimuli changes the variance of the likelihood function but not
measures of central tendency (Jazayeri & Movshon, 2006).
Equation 7 provides a way to simulate cells sensitive to
specific angles at specific orientations, and could be used
to predict tuning curves for such cells in V2 and V4
(Pasupathy & Connor, 1999; Ito & Komatsu, 2004). This
equation takes into account cross-orientation inhibition
between V1 cells, so responses of V1 cells are not
assumed to be independent in the model. V1 cells also
have anatomical links to cells with co-oriented and coaxially aligned receptive fields (Bosking et al., 1997).
Such long-range excitatory connections could facilitate the
extraction of curved contours (Ben-Shahar & Zucker,
2003). Replacing cross-orientation inhibition in the model
with such excitatory V1 connections leads to predictions
that overestimate the vertical angle and underestimate
the oblique angle, i.e. distortions opposite to the observed
perceptual bias. A similarly incorrect prediction was
obtained if cross-orientation inhibition was replaced by a
stronger divisive gain for horizontally tuned cells than for
obliquely tuned cells (Hansen & Essock, 2004).
This model can be viewed as a formal embodiment of Mach’s (1897) idea of
“contrast in directions” which he proposed as an explanation for why obtuse
angles tend to appear contracted and acute angles tend to appear expanded
(Wundt, 1862). In the simulations, we found that contraction of obtuse angles
is not a general result of orientation contrast, as presupposed by Blakemore et
al. (1970), but occurs only for certain relative widths of excitation and
inhibition. Mach’s second explanation for angle distortions invoked the
projective tendency of acute angles in the image to originate from 3-D angles
that are greater than their projections and obtuse projections to arise from
smaller 3-D angles (quantified by Nundy et al., 2000). To explain our results,
this hypothesis requires that the 3-D angles in the world that project to
oblique obtuse angles be wider on average than the 3-D angles that project to
vertical obtuse angles. We used Equation 8 as an approximation to the
frequency of image angles in natural scenes. The model was insensitive to
P(Ω). It is likely that tuning-width anisotropies will also explain Bouma &
Andriessen’s (1970) result that the magnitude of the induced effect on the
perceived orientation of a line segment depends on the orientations of the
inducing and test lines.