Transcript 11-1
Inference and Computation with
Population Codes
Alexandre Pouget, Peter Dayan, and Richard S. Zemel
Annual Review of Neuroscience, 2003
Presenters: Sangwook Hahn, Jisu Kim
Inference and Computation with
Population Codes
1 / 41
13 November 2012
Outline
1. Introduction
2. The Standard Model (First Part)
   1. Coding and Decoding
   2. Computation with Population Codes
   3. Discussion of Standard Model
3. Encoding Probability Distributions (Second Part)
   1. Motivation
   2. Psychophysical Evidence
   3. Encoding and Decoding Probability Distributions
   4. Examples in Neurophysiology
   5. Computations Using Probabilistic Population Codes
Introduction
Single aspects of the world induce activity in multiple neurons.
For example, an air current:
– 1. An air current is produced by the cricket's predator
– 2. The cricket determines the direction of the air current
– 3. It evades in a direction away from the predator's predicted move
Introduction
Analyzing the same example from the viewpoint of neural activity:
an air current
– 1. An air current is produced by the cricket's predator
– 2. The cricket determines the direction of the air current
( i. a population of neurons encodes information about a single variable;
ii. the information is decoded from the population activity )
– 3. It evades in a direction away from the predator's predicted move
Guiding Questions (At First Part)
Q1:
How do populations of neurons encode information about single variables?
How can this information be decoded from the population activity?
How do neural populations realize function approximation?
Q2:
How do population codes support nonlinear computations
over the information they represent?
The Standard Model – Coding
The cricket cercal system has hair cells (a) as its primary sensory neurons
Normalized mean firing rates of 4 low-velocity interneurons
s is the direction of an air current (induced by a predator)
The Standard Model – Encoding Model
Mean activity f_a(s) of cell a depends on s
– f_a(s) = r_max [ v(s) · c_a ]_+
– r_max : maximum firing rate
– c_a : preferred direction of cell a
Natural way of describing tuning curves
– the mean rate is proportional to the thresholded projection [ v · c_a ]_+
of the direction vector v onto c_a
The Standard Model – Decoding
3 methods to decode homogeneous population codes
– 1. Population vector approach
– 2. Maximum likelihood decoding
– 3. Bayesian estimator
Population vector approach ( sum )
– v_pop = Σ_a r_a c_a : population vector
– c_a : preferred direction of cell a
– r_a : actual (noisy) rates, fluctuating about the mean rates
– ŝ = direction of v_pop : approximation of the wind direction (r is the vector of noisy rates)
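The population vector estimate above can be sketched numerically. This is a minimal illustration, not the paper's simulation: the 4 preferred directions, maximum rate, and additive Gaussian noise level are all assumed values chosen to mimic the cercal system.

```python
import numpy as np

# Minimal sketch of population vector decoding (all numbers are illustrative
# assumptions): 4 interneurons with preferred directions 90 degrees apart and
# half-wave rectified cosine tuning, as in the cricket cercal system.
preferred = np.deg2rad([45.0, 135.0, 225.0, 315.0])
C = np.stack([np.cos(preferred), np.sin(preferred)])   # unit vectors c_a, (2, 4)

def mean_rates(s, r_max=40.0):
    """Mean rates f_a(s) = r_max * [v(s) . c_a]_+ (thresholded projection)."""
    v = np.array([np.cos(s), np.sin(s)])
    return r_max * np.maximum(v @ C, 0.0)

def population_vector(r):
    """Decode: direction of v_pop = sum_a r_a c_a."""
    vpop = C @ r
    return np.arctan2(vpop[1], vpop[0])

rng = np.random.default_rng(0)
s_true = np.deg2rad(60.0)
r = mean_rates(s_true) + rng.normal(0.0, 2.0, size=4)  # noisy rates
s_hat = population_vector(r)
```

With only four noisy neurons, the estimate typically lands within a few degrees of the true direction, matching the claim on the next slide.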
The Standard Model – Decoding
Main problem of the population vector method
– It is not sensitive to the noise process that generates the rates r
– However, it works quite well :
estimation of the wind direction to within a few degrees is possible
with only 4 noisy neurons
The Standard Model – Decoding
Maximum likelihood decoding
– This estimator starts from the full probabilistic encoding model,
taking into account the noise corrupting the neurons' activities
– The likelihood P[r|s] measures how probable the observed activities r
are under each candidate stimulus value s
– If P[r|s] is high -> those s values are likely, given the observed activities
– If P[r|s] is low -> those s values are unlikely, given the observed activities
( figure: rms = root-mean-square deviation )
The Standard Model – Decoding
Bayesian estimators
– Combine the likelihood P[r|s] with any prior information about the stimulus s
to produce a posterior distribution P[s|r] : P[s|r] = P[r|s] P[s] / P[r]
– If the prior distribution P[s] is flat, there is no specific prior information about s,
and the posterior is a renormalized version of the likelihood
– The Bayesian estimator does a little better
than maximum likelihood and the population vector
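The maximum likelihood and Bayesian estimators can be sketched side by side. This sketch assumes the same hypothetical 4-neuron cercal model as before, with independent Gaussian noise of known sigma, a flat prior, and a grid over candidate directions.

```python
import numpy as np

# Sketch of ML and Bayesian decoding on an assumed 4-neuron cercal model
# with independent Gaussian noise and a flat prior P[s].
preferred = np.deg2rad([45.0, 135.0, 225.0, 315.0])
C = np.stack([np.cos(preferred), np.sin(preferred)])

def mean_rates(s, r_max=40.0):
    v = np.array([np.cos(s), np.sin(s)])
    return r_max * np.maximum(v @ C, 0.0)

sigma = 2.0
rng = np.random.default_rng(1)
s_true = np.deg2rad(60.0)
r = mean_rates(s_true) + rng.normal(0.0, sigma, size=4)

grid = np.linspace(-np.pi, np.pi, 721)                    # candidate s values
F = np.stack([mean_rates(s) for s in grid])               # mean rates, (721, 4)
loglik = -np.sum((r - F) ** 2, axis=1) / (2 * sigma**2)   # log P[r|s] + const

s_ml = grid[np.argmax(loglik)]                 # maximum likelihood estimate

post = np.exp(loglik - loglik.max())
post /= post.sum()                             # P[s|r] with a flat prior
s_bayes = np.angle(np.sum(post * np.exp(1j * grid)))   # circular posterior mean
```

With a flat prior the posterior is just the renormalized likelihood, so the two estimates nearly coincide here; a non-flat prior would pull the Bayesian estimate toward likely stimulus values.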
The Standard Model – Decoding
In a homogeneous population
– Bayesian & maximum likelihood decoding >>> population vector
– 'the greater the number of cells, the greater the accuracy',
since more cells provide more information about the stimulus
Computation with Population Code
Discrimination
– Given two stimulus values s and s + δs, where δs is a small angle,
we can use the Bayesian posterior P[s|r] to discriminate between them
– It is also possible to perform discrimination based directly on the activities
by computing a linear discriminant ⟨w, r⟩ :
– threshold : usually 0 for a homogeneous population code
– w : relative weights of the neurons
Computation with Population Code
Noise Removal
– The neurobiological relevance of the maximum likelihood estimator is unclear :
• 1. collapsing the code to a single scalar value seems unreasonable,
because population codes seem to be used throughout the brain
• 2. finding the maximum likelihood value is difficult in general
– Solution : utilize recurrent connections within the population
to make it behave like an autoassociative memory
• Autoassociative memories use nonlinear recurrent interactions
to find the stored pattern that most closely matches a noisy input
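The autoassociative cleanup idea can be sketched as a line attractor. Everything here is a hypothetical toy model loosely in the spirit of Deneve et al.: Gaussian lateral weights, a squaring nonlinearity, and divisive normalization relax a noisy input onto a smooth hill of activity.

```python
import numpy as np

# Hypothetical line-attractor sketch of noise removal (weights and constants
# are assumptions): recurrent Gaussian lateral connections plus divisive
# normalization relax a noisy hill onto a clean one.
n = 64
prefs = np.linspace(-np.pi, np.pi, n, endpoint=False)
d = np.angle(np.exp(1j * (prefs[:, None] - prefs[None, :])))  # circular distance
W = np.exp(-d**2 / (2 * 0.5**2))                              # lateral weights

rng = np.random.default_rng(2)
s_true = 0.3
hill = np.exp(-np.angle(np.exp(1j * (prefs - s_true)))**2 / (2 * 0.3**2))
r = np.clip(hill + rng.normal(0.0, 0.1, n), 0.0, None)        # noisy input

for _ in range(20):                         # recurrent relaxation
    u = W @ r
    r = u**2 / (1.0 + 0.1 * np.sum(u**2))   # squaring + divisive normalization

s_hat = np.angle(np.sum(r * np.exp(1j * prefs)))  # population vector readout
```

The network settles on a stereotyped smooth hill whose position matches the noisy input's peak, which is exactly the "closest stored pattern" behavior described above.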
Computation with Population Code
Basis Function Computations
– Function approximation computes the output of functions
for the case of multiple stimulus dimensions.
– For example, s_h = s_r + s_e :
– s_h : head-centered direction to a target
– s_r : eye-centered direction to the target
– s_e : position of the eyes in the head
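The coordinate transform s_h = s_r + s_e can be sketched with a basis-function layer. All grids and tuning widths below are illustrative assumptions: an intermediate layer tuned to products of s_r and s_e is read out linearly to form a code for the head-centered direction.

```python
import numpy as np

# Sketch of a basis-function network for s_h = s_r + s_e (tuning widths and
# grids are illustrative assumptions).
sr_prefs = np.linspace(-1.0, 1.0, 21)
se_prefs = np.linspace(-1.0, 1.0, 21)
sh_prefs = np.linspace(-2.0, 2.0, 41)

def gauss(x, mu, sig=0.2):
    return np.exp(-(x - mu) ** 2 / (2 * sig**2))

def basis_layer(sr, se):
    """Each basis unit multiplies a Gaussian in s_r with a Gaussian in s_e."""
    return np.outer(gauss(sr, sr_prefs), gauss(se, se_prefs))

def head_centered_code(B):
    """Linear readout: the s_h unit pools basis units with sr_a + se_b near s_h."""
    S = sr_prefs[:, None] + se_prefs[None, :]
    return np.array([np.sum(B * gauss(S, mu)) for mu in sh_prefs])

out = head_centered_code(basis_layer(0.4, -0.1))
sh_hat = sh_prefs[np.argmax(out)]        # output hill peaks near s_r + s_e
```

The nonlinearity lives entirely in the multiplicative basis layer; the mapping from basis activities to the output population code is linear, which is the key computational point of the slide.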
Computation with Population Code
Basis Function Computations
( figure )
Computation with Population Code
Basis Function Computations
– linear solution for homogeneous population codes
(mapping from one population code to another, ignoring noise )
Guiding Questions (At First Part)
Q1:
How do populations of neurons encode information about single variables?
-> p.6~7
How can this information be decoded from the population activity?
-> p.8~12
How do neural populations realize function approximation?
-> p.13~14
Q2:
How do population codes support nonlinear computations
over the information they represent?
-> p.15~17
Encoding Probability Distributions
Motivation
The standard model has two main restrictions :
1. We only consider uncertainty coming from noisy neural activities (internal noise)
: but uncertainty in the world is inherent, independent of internal noise.
2. We do not consider anything other than estimating a single value
: utilizing the full information contained in the posterior is crucial.
Motivation
“Ill-posed problems” : images do not contain enough information.
The aperture problem
: images do not unambiguously specify the motion of the object.
Solution : a probabilistic approach
: perception is conceived as statistical inference, giving rise to probability
distributions over the encoded values.
Motivation
( figure )
Psychophysical Evidence
Perceived speed of a grating increases with contrast.
The nervous system seeks the posterior distribution of velocity given the image
sequence, obtained through Bayes rule :
P[s|I] = P[I|s] P[s] / P[I]
High contrast -> the likelihood function becomes narrow
-> the likelihood dominates the product P[I|s] P[s]
Psychophysical Evidence
( figure )
Encoding and Decoding Probability Distributions
Log-likelihood method :
The activity of a neuron tuned to prefer velocity v is viewed as reporting
the log-likelihood of the image given the motion, log(P[I|v]).
This provides a statistical interpretation, and decoding only involves the simple
operation of exponentiation to recover the full likelihood.
Some schemes for computing with these codes require that the likelihood
have only one peak.
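A minimal numerical sketch of this scheme, with an assumed grid of preferred velocities and an assumed Gaussian log-likelihood profile: each unit's activity is taken to be log P[I|v_a], and decoding just exponentiates and normalizes.

```python
import numpy as np

# Minimal sketch of the log-likelihood scheme (grid and likelihood profile
# are illustrative assumptions): activity_a ~ log P[I|v_a], so decoding is
# a single exponentiation.
v_prefs = np.linspace(-5.0, 5.0, 101)     # preferred velocities v_a
v_true, sig = 1.2, 0.8

activities = -(v_prefs - v_true) ** 2 / (2 * sig**2)   # ~ log P[I|v_a]

lik = np.exp(activities - activities.max())            # decode: exponentiate
lik /= lik.sum()                                       # normalized likelihood

v_hat = v_prefs[np.argmax(lik)]   # single-peaked, as some schemes require
```

Note that the decoded likelihood here is unimodal by construction, which is exactly the single-peak condition mentioned above.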
Encoding and Decoding Probability Distributions
Gain encoding for Gaussian distributions :
Using a Bayesian approach to decode a population pattern -> P[s|r]
Assuming independent noise in the responses of the neurons,
the posterior distribution converges to a Gaussian.
The gain of the population activity controls the standard deviation of the
posterior distribution.
Strong limitation : it only works for simple Gaussian distributions.
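The gain-to-width relation can be checked numerically. The model details below are assumptions (Gaussian tuning curves, independent Poisson noise, a flat prior on a grid): decoding the same population at two gains shows that higher gain yields a narrower posterior.

```python
import numpy as np

# Sketch of gain encoding (model details are assumptions): Gaussian tuning
# with independent Poisson noise gives a near-Gaussian posterior over s,
# and increasing the gain g narrows it.
prefs = np.linspace(-3.0, 3.0, 61)        # preferred stimuli
grid = np.linspace(-2.0, 2.0, 401)        # candidate s values
tc_sig = 0.5                              # tuning-curve width

def posterior_width(g, s_true=0.0, seed=3):
    rng = np.random.default_rng(seed)
    f = g * np.exp(-(prefs - s_true) ** 2 / (2 * tc_sig**2))   # mean rates
    r = rng.poisson(f)                                          # noisy counts
    F = g * np.exp(-(prefs[None, :] - grid[:, None]) ** 2 / (2 * tc_sig**2))
    loglik = np.sum(r[None, :] * np.log(F + 1e-12) - F, axis=1) # Poisson log P[r|s]
    p = np.exp(loglik - loglik.max())
    p /= p.sum()
    mean = np.sum(p * grid)
    return np.sqrt(np.sum(p * (grid - mean) ** 2))  # posterior std deviation
```

Calling `posterior_width` with a high gain returns a smaller value than with a low gain: the gain acts as a certainty signal, while the shape of the posterior stays Gaussian, which is the limitation the slide points out.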
Encoding and Decoding Probability Distributions
( figure )
Encoding and Decoding Probability Distributions
Convolution encoding :
Can deal with non-Gaussian distributions that cannot be characterized by a
few parameters, such as their means and variances.
Represent the distribution using a convolution code, obtained by
convolving the distribution with a particular set of kernel functions.
Encoding and Decoding Probability Distributions
Motivation : Fourier series
f : 2π-periodic, odd function ( f(−x) = −f(x) )
f --encoding--> { a_n | n ∈ ℕ } --decoding--> f
Encoding : a_n = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx
Decoding : f(x) = Σ_{n=1}^{∞} a_n sin(nx)
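The encode/decode pair above can be verified numerically. The odd test function below is an illustrative choice, not from the slides; each coefficient plays the role of one neuron's activity.

```python
import numpy as np

# Numerical check of the Fourier sine encode/decode pair; the test function
# f is an illustrative assumption.
f = lambda x: np.sin(x) + 0.5 * np.sin(3 * x)   # 2*pi-periodic, f(-x) = -f(x)

n_grid = 20000
x = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
dx = 2.0 * np.pi / n_grid

def encode(n):
    """a_n = (1/pi) * integral_{-pi}^{pi} f(x) sin(nx) dx, as a Riemann sum."""
    return np.sum(f(x) * np.sin(n * x)) * dx / np.pi

a = {n: encode(n) for n in range(1, 6)}

def decode(x0):
    """f(x0) ~ sum_n a_n sin(n x0), truncated to the computed coefficients."""
    return sum(a[n] * np.sin(n * x0) for n in a)
```

Here `a[1]` recovers 1.0 and `a[3]` recovers 0.5, and the truncated decoding reproduces f, illustrating how a set of scalar coefficients can stand in for a whole function.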
Encoding and Decoding Probability Distributions
Use a large population of neurons to encode any function by
devoting each neuron to the encoding of one particular coefficient.
The activity of neuron a is computed by taking the inner product between
a kernel function assigned to that neuron and the function being encoded.
Encoding and Decoding Probability Distributions
Encoding schemes
Kernel – sine function :
𝑓𝑎 (𝑃 𝑠 𝑰 ) =
Kernel – Gaussian : Gaussian kernel
𝑓𝑎 (𝑃 𝑠 𝑰 ) =
𝑑𝑠 sin 𝑤𝑎 𝑠 + ∅𝑎 𝑃[𝑠|𝑰]
𝑑𝑠 𝑒𝑥𝑝 −
(𝑠 − 𝑠𝑎 )2
2𝜎𝑎 2
𝑃[𝑠|𝑰]
Kernel – Gaussian, 𝑃 𝑠 𝑰 = 𝛿(𝑠, 𝑠 ∗ ) :
𝑓𝑎 (𝑃 𝑠 𝑰 ) = 𝑒𝑥𝑝 −
Inference and Computation with
Population Codes
31 / 41
(𝑠 ∗ − 𝑠𝑎 )2
2𝜎𝑎 2
13 November 2012
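The Gaussian-kernel case can be sketched directly. The grid, kernel centers, widths, and the bimodal example distribution are all illustrative assumptions: each unit's activity is the inner product of its kernel with P[s|I].

```python
import numpy as np

# Sketch of a Gaussian-kernel convolution code (grid, centers, and widths
# are illustrative assumptions): f_a = <kernel_a, P[s|I]>.
s = np.linspace(-5.0, 5.0, 1001)
ds = s[1] - s[0]
centers = np.linspace(-4.0, 4.0, 33)     # preferred values s_a
sig = 0.5                                # kernel width sigma_a

# A bimodal distribution that a single mean and variance could not summarize.
P = np.exp(-(s + 1.5) ** 2 / 0.2) + 0.6 * np.exp(-(s - 2.0) ** 2 / 0.3)
P /= P.sum() * ds

# f_a = integral ds exp(-(s - s_a)^2 / (2 sigma_a^2)) P[s|I]
K = np.exp(-(s[None, :] - centers[:, None]) ** 2 / (2 * sig**2))
f = (K * P[None, :]).sum(axis=1) * ds    # one activity per unit
```

The activity profile over `centers` is itself bimodal, so the population carries both modes of the distribution; for P[s|I] = δ(s − s*) the same formula reduces to the third case on the slide.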
Encoding and Decoding Probability Distributions
Decoding scheme - Anderson’s approach
The activity r_a of neuron a is considered to be a vote for a particular decoding
basis function P_a[s].
Overall distribution decoded : P[s|I] = Σ_a r_a P_a[s] / Σ_b r_b
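A minimal sketch of this vote-style decoder, with assumed Gaussian basis functions P_a[s] and an assumed example rate pattern: the decoded distribution is the activity-weighted mixture of the basis distributions.

```python
import numpy as np

# Sketch of Anderson's vote-style decoder (basis functions and example rates
# are assumptions): P[s|I] = sum_a r_a P_a[s] / sum_b r_b.
s = np.linspace(-5.0, 5.0, 1001)
ds = s[1] - s[0]
centers = np.linspace(-4.0, 4.0, 17)

def norm_gauss(mu, sig=0.6):
    g = np.exp(-(s - mu) ** 2 / (2 * sig**2))
    return g / (g.sum() * ds)            # normalized decoding basis P_a[s]

basis = np.array([norm_gauss(mu) for mu in centers])
r = np.exp(-(centers - 1.0) ** 2 / 2.0)  # example rates, peaked near s = 1

P = (r @ basis) / r.sum()   # decoded distribution over s
```

Because each P_a[s] integrates to one and the weights are normalized by the total activity, the decoded P is automatically a proper probability distribution.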
Encoding and Decoding Probability Distributions
Decoding scheme - Zemel’s approach
Probabilistic approach : recover the most likely distribution over s, 𝑃[𝑠|𝑰]
Can be achieved using a nonlinear regression method such as the
Expectation-Maximization algorithm.
Examples in Neurophysiology
Uncertainty in 2-AFC (two-alternative forced choice) tasks :
these examples offer preliminary evidence that neurons represent probability
distributions, or related quantities such as log-likelihood ratios.
There are also experiments supporting gain encoding, convolution codes,
and DDPC, respectively.
Computations Using Probabilistic Population Codes
Experiment by Ernst & Banks (2002) : judging the width of a bar
The optimal strategy : recover the posterior distribution over the width w,
given the visual image V and the haptic input H
Using Bayes rule (with V and H conditionally independent given w) :
P[w|V,H] ∝ P[V,H|w] P[w] = P[V|w] P[H|w] P[w]
Computations Using Probabilistic Population Codes
If we use a convolution code for all the distributions
– multiply all the population codes together, term by term
– this requires neurons that can multiply or sum : an achievable neural operation
If the probability distributions are encoded using the position and gain of
population codes
– Solution : Deneve et al. (2001)
– Some limitations
– Performs Bayesian inference using noisy population codes
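The term-by-term multiplication case can be sketched concretely. The grids and likelihood widths below are illustrative assumptions: multiplying the two codes and renormalizing implements Bayes rule with a flat prior, and for Gaussian cues the result is the familiar precision-weighted combination.

```python
import numpy as np

# Sketch of cue combination with convolution codes (grids and widths are
# illustrative assumptions): P[w|V,H] proportional to P[V|w] * P[H|w]
# under a flat prior, computed term by term on a shared grid of widths w.
w = np.linspace(0.0, 10.0, 1001)
dw = w[1] - w[0]

lik_V = np.exp(-(w - 5.0) ** 2 / (2 * 1.0**2))   # visual likelihood, sigma = 1
lik_H = np.exp(-(w - 6.0) ** 2 / (2 * 2.0**2))   # haptic likelihood, sigma = 2

post = lik_V * lik_H            # term-by-term multiplication
post /= post.sum() * dw         # renormalize

w_mean = np.sum(w * post) * dw
w_std = np.sqrt(np.sum((w - w_mean) ** 2 * post) * dw)
# Gaussian x Gaussian: mean (5/1 + 6/4)/(1 + 1/4) = 5.2, std sqrt(0.8)
```

The combined estimate sits closer to the more reliable (visual) cue and is narrower than either single-cue likelihood, which is the optimal behavior Ernst & Banks observed psychophysically.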
Computations Using Probabilistic Population Codes
( figure )
Guiding Questions (At Second Part)
Q3: How may neural populations offer a rich representation of such things
as uncertainty in the aspects of the stimuli they represent?
# 21 ~ # 24
Probabilistic approach : perception is conceived as statistical inference,
giving rise to probability distributions over the encoded values.
Hence the activity of neural populations represents probability distributions,
which carries information about uncertainty.
Guiding Questions (At Second Part)
Q4: How can populations of neurons represent probability distributions?
How can they perform Bayesian probabilistic inference?
#25 ~ #31 (for first), #37 ~ #39 (for second)
Several schemes have been proposed for encoding probability distributions
in populations of neurons : Log-likelihood method, Gain encoding for
Gaussian distributions, Convolution encoding.
Bayesian probabilistic inference can be done by multiplying all the population
codes together (convolution encoding), or by operating on noisy population
codes (gain encoding).
Guiding Questions (At Second Part)
Q5: How are multiple aspects of the world represented in single
populations? What computational advantages (or disadvantages) do such
schemes have?
# 25 ~ # 28 (first)
Log-likelihood : the likelihood function
Gain encoding : mean and standard deviation
Convolution encoding : the full probability distribution
Guiding Questions (At Second Part)
Q5: How are multiple aspects of the world represented in single
populations? What computational advantages (or disadvantages) do such
schemes have?
# 25 ~ # 28 (second)
Log-likelihood : decoding is simple, but there are some restrictions on the distribution
Gain encoding : strong restriction to Gaussian distributions
Convolution encoding : can work for complicated distributions