Transcript 11-1

Inference and Computation with
Population Codes
Alexandre Pouget, Peter Dayan, and Richard S. Zemel
Annual Review of Neuroscience, 2003
Presenter : Sangwook Hahn, Jisu Kim
Inference and Computation with
Population Codes
1 / 41
13 November 2012
Outline
1. Introduction
2. The Standard Model (First Part)
   1. Coding and Decoding
   2. Computation with Population Codes
   3. Discussion of the Standard Model
3. Encoding Probability Distributions (Second Part)
   1. Motivation
   2. Psychophysical Evidence
   3. Encoding and Decoding Probability Distributions
   4. Examples in Neurophysiology
   5. Computations Using Probabilistic Population Codes
2 / 41
Introduction

Single aspects of the world induce activity in multiple neurons.

For example, an air current:
– 1. An air current is produced by a predator of the cricket
– 2. The cricket determines the direction of the air current
– 3. It evades in a direction away from the predator's predicted move
3 / 41
Introduction

Analyzing the example from the viewpoint of neural activity, for an air current:
– 1. An air current is produced by a predator of the cricket
– 2. The cricket determines the direction of the air current
  (i. a population of neurons encodes information about a single variable;
  ii. the information is decoded from the population activity)
– 3. It evades in a direction away from the predator's predicted move
4 / 41
Guiding Questions (At First Part)

Q1:
How do populations of neurons encode information about single variables?
How can this information be decoded from the population activity?
How do neural populations realize function approximation?

Q2:
How do population codes support nonlinear computations
over the information they represent?
5 / 41
The Standard Model – Coding

The cricket cercal system has hair cells (a) as its primary sensory neurons.

The normalized mean firing rates of the 4 low-velocity interneurons are tuned to s,

where s is the direction of an air current (induced by a predator).
6 / 41
The Standard Model – Encoding Model

Mean activity <r_a> = f_a(s) of cell a depends on s:
– f_a(s) = r_max [cos(s − s_a)]_+
– r_max : maximum firing rate
– s_a : preferred direction of cell a

A natural way of describing the tuning curves:
– f_a(s) is proportional to the thresholded projection of v
  (the unit vector pointing in direction s) onto c_a
  (the unit vector pointing in the preferred direction):
  f_a(s) = r_max [v · c_a]_+
7 / 41
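The rectified-cosine tuning curves above can be sketched numerically. This is a minimal illustration; the value r_max = 40 spikes/s is an assumed, illustrative constant.

```python
import numpy as np

def tuning_curve(s, s_a, r_max=40.0):
    """Mean firing rate f_a(s) = r_max * [cos(s - s_a)]_+ of a cercal
    interneuron with preferred direction s_a (angles in radians)."""
    return r_max * np.maximum(0.0, np.cos(s - s_a))

# Four interneurons with preferred directions 45, 135, 225, 315 degrees.
preferred = np.deg2rad([45.0, 135.0, 225.0, 315.0])

rates = tuning_curve(np.deg2rad(45.0), preferred)  # wind from 45 degrees
```

A wind direction matching a cell's preferred direction drives it at r_max, while cells tuned to the opposite direction are silenced by the rectification.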
The Standard Model – Decoding

3 methods to decode homogeneous population codes:
– 1. Population vector approach
– 2. Maximum likelihood decoding
– 3. Bayesian estimator

Population vector approach (a simple sum):
– v_pop = Σ_a r_a c_a : population vector
– c_a : preferred direction of cell a
– r_a : actual (noisy) rates, fluctuating around the mean rates
– ŝ : direction of v_pop, the approximation of the wind direction
  (r is the vector of noisy rates)
8 / 41
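A minimal numerical sketch of the population vector estimate. The additive Gaussian noise model and all parameter values here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

r_max = 40.0
preferred = np.deg2rad([45.0, 135.0, 225.0, 315.0])
c = np.stack([np.cos(preferred), np.sin(preferred)], axis=1)  # unit vectors c_a

def mean_rates(s):
    return r_max * np.maximum(0.0, np.cos(s - preferred))

def population_vector_estimate(r):
    """Direction of v_pop = sum_a r_a * c_a, the wind-direction estimate."""
    v_pop = r @ c
    return np.arctan2(v_pop[1], v_pop[0])

s_true = np.deg2rad(80.0)
r = mean_rates(s_true) + rng.normal(0.0, 2.0, size=4)  # assumed additive noise
s_hat = population_vector_estimate(r)
```

With noiseless rates the estimate is exact here; with noisy rates it stays within a few degrees, matching the slide's claim for only 4 neurons.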
The Standard Model – Decoding

Main problem of the population vector method:
– It is not sensitive to the noise process that generates r
– However, it works quite well in practice
– The wind direction can be estimated to within a few degrees
  with only 4 noisy neurons
9 / 41
The Standard Model – Decoding

Maximum likelihood decoding:
– This estimator starts from the full probabilistic encoding model,
  taking into account the noise corrupting the neurons' activities
– The likelihood P[r|s] measures how probable the observed activities r
  are for each stimulus value s
– If P[r|s] is high -> those s values are likely given the observed activities
– If P[r|s] is low -> those s values are unlikely given the observed activities
(rms = root-mean-square deviation, used to measure the quality of the estimate)
10 / 41
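A grid-search sketch of maximum likelihood decoding. Independent Poisson spike-count noise is an assumed noise model, and all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

r_max = 40.0
preferred = np.deg2rad([45.0, 135.0, 225.0, 315.0])

def mean_rates(s):
    return r_max * np.maximum(0.0, np.cos(s - preferred))

def log_likelihood(counts, s_grid):
    """log P[r|s] up to a constant, for independent Poisson spike counts."""
    # small floor avoids log(0) where the rectified tuning curve is zero
    f = r_max * np.maximum(1e-9, np.cos(s_grid[:, None] - preferred[None, :]))
    return (counts * np.log(f) - f).sum(axis=1)

s_true = np.deg2rad(80.0)
counts = rng.poisson(mean_rates(s_true))            # observed spike counts
s_grid = np.deg2rad(np.linspace(0.0, 360.0, 3601))
s_ml = s_grid[np.argmax(log_likelihood(counts, s_grid))]  # ML estimate
```

The estimate maximizes P[r|s] over a fine grid of candidate directions; stimulus values whose predicted rates disagree with the observed counts receive very low likelihood.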
The Standard Model – Decoding

Bayesian estimators:
– Combine the likelihood P[r|s] with any prior information about the stimulus s
  to produce a posterior distribution P[s|r]
– If the prior distribution P[s] is flat, there is no specific prior information
  about s, and the posterior is a renormalized version of the likelihood
– The Bayesian estimator does a little better
  than maximum likelihood
  and the population vector
11 / 41
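A sketch of the Bayesian posterior in the same illustrative setup (Poisson noise assumed). With a flat prior the posterior is just the renormalized likelihood, as the slide notes:

```python
import numpy as np

rng = np.random.default_rng(2)

r_max = 40.0
preferred = np.deg2rad([45.0, 135.0, 225.0, 315.0])
s_grid = np.deg2rad(np.linspace(0.0, 360.0, 3600, endpoint=False))
ds = s_grid[1] - s_grid[0]

f = r_max * np.maximum(1e-9, np.cos(s_grid[:, None] - preferred[None, :]))
counts = rng.poisson(r_max * np.maximum(0.0, np.cos(np.deg2rad(80.0) - preferred)))

log_lik = (counts * np.log(f) - f).sum(axis=1)   # log P[r|s], Poisson assumed
prior = np.ones_like(s_grid) / (2 * np.pi)       # flat prior P[s]
post = np.exp(log_lik - log_lik.max()) * prior   # P[s|r] proportional to P[r|s] P[s]
post /= post.sum() * ds                          # renormalize

s_map = s_grid[np.argmax(post)]                  # equals the ML estimate here
```

Unlike a point estimator, the full posterior also carries the width (uncertainty) of the estimate.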
The Standard Model – Decoding

In a homogeneous population:
– Bayesian & maximum likelihood decoding >>> population vector
– 'The greater the number of cells,
  the greater the accuracy,'
  since more cells provide more information about the stimulus
12 / 41
Computation with Population Code

Discrimination
– Given stimuli s and s + δs, where δs is a small angle,
  we can use the Bayesian posterior P[s|r] to discriminate between them
– It is also possible to perform discrimination based directly on the activities
  by computing a linear value Σ_a w_a r_a and comparing it to a threshold:
  – threshold : usually 0 for a homogeneous population code
  – w_a : relative weight of cell a
13 / 41
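A sketch of linear discrimination between s and s + δs with a 12-cell homogeneous code. The difference-of-means weights and midpoint threshold used here are simple assumed choices, not necessarily the optimal weights:

```python
import numpy as np

rng = np.random.default_rng(3)

r_max = 40.0
preferred = np.deg2rad(np.arange(0.0, 360.0, 30.0))   # 12-cell homogeneous code

def mean_rates(s):
    return r_max * np.maximum(0.0, np.cos(s - preferred))

s1, s2 = np.deg2rad(80.0), np.deg2rad(86.0)           # delta-s = 6 degrees
f1, f2 = mean_rates(s1), mean_rates(s2)

w = f2 - f1                      # difference-of-means weights (a simple choice)
theta = w @ (f1 + f2) / 2.0      # midpoint threshold

def favors_s2(r):
    """True if the linear value sum_a w_a r_a exceeds the threshold."""
    return w @ r > theta

trials = 1000
hits = sum(favors_s2(mean_rates(s2) + rng.normal(0.0, 2.0, preferred.size))
           for _ in range(trials))
```

Even for a 6-degree difference, the linear readout classifies the large majority of noisy trials correctly under these assumed noise parameters.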
Computation with Population Code

Noise Removal
– The neurobiological relevance of the maximum likelihood estimator is unclear:
• 1. extracting a single scalar value seems unreasonable,
  because population codes seem to be used throughout the brain
• 2. finding the maximum likelihood value is difficult in general
– Solution : use recurrent connections within the population
  to make it behave like an autoassociative memory
• Autoassociative memories use nonlinear recurrent interactions
  to find the stored pattern that most closely matches a noisy input
14 / 41
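One way to picture the autoassociative cleanup is a recurrent network that repeatedly filters the activity through broad excitatory weights and divisive normalization. This is only a loose sketch in the spirit of such networks; the dynamics and all constants are assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 60
angles = np.linspace(0.0, 2 * np.pi, n, endpoint=False)   # preferred directions

# Broad recurrent excitation between cells with nearby preferred directions.
d = np.angle(np.exp(1j * (angles[:, None] - angles[None, :])))
W = np.exp(-d**2 / (2 * 0.3**2))

s_true = 1.5
clean = np.exp(-np.angle(np.exp(1j * (angles - s_true)))**2 / (2 * 0.3**2))
a = np.maximum(0.0, clean + rng.normal(0.0, 0.3, n))      # noisy input activity

for _ in range(20):                         # recurrent cleanup dynamics
    u = W @ a
    a = u**2 / (1.0 + 0.1 * (u**2).sum())   # squaring + divisive normalization

s_hat = angles[a.argmax()]                  # peak of the cleaned-up hill
```

The iteration smooths the noisy input into a stereotyped hill of activity whose peak serves as the estimate, so the population itself, not a downstream scalar readout, carries the answer.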
Computation with Population Code

Basis Function Computations
– Function approximation computes the output of functions
  for the case of multiple stimulus dimensions.
– For example, s_h = s_r + s_e, where
  s_h : head-centered direction to a target
  s_r : eye-centered direction of the target
  s_e : position of the eyes in the head
15 / 41
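A sketch of a basis-function network computing s_h = s_r + s_e: hidden units with product (Gaussian × Gaussian) tuning to (s_r, s_e), plus a linear readout fitted here by least squares. The training procedure and all parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
c = np.linspace(-2.0, 2.0, 15)                 # preferred values along one axis

def code(s, width=0.4):
    return np.exp(-(s - c)**2 / (2 * width**2))

def basis(sr, se):
    """Basis-function layer: product tuning to (s_r, s_e)."""
    return np.outer(code(sr), code(se)).ravel()

# Fit a linear readout so the output population encodes s_h = s_r + s_e.
samples = rng.uniform(-1.0, 1.0, size=(400, 2))
H = np.stack([basis(sr, se) for sr, se in samples])
G = np.stack([code(sr + se) for sr, se in samples])
W, *_ = np.linalg.lstsq(H, G, rcond=None)

out = basis(0.5, -0.2) @ W                     # should resemble code(0.3)
err = np.abs(out - code(0.3)).max()
```

Because the hidden layer forms a basis over the joint stimulus space, a purely linear readout suffices for this nonlinear combination of the inputs.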
Computation with Population Code

Basis Function Computations
16 / 41
Computation with Population Code

Basis Function Computations
– A linear solution exists for homogeneous population codes
  (mapping from one population code to another, ignoring noise)
17 / 41
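The claim that (noise aside) one homogeneous code can be mapped onto another by purely linear weights can be sketched by solving for those weights with least squares; the tuning widths and population sizes below are arbitrary choices:

```python
import numpy as np

n_in, n_out = 30, 20
s_grid = np.linspace(-np.pi, np.pi, 200, endpoint=False)

def pop_code(s, centers, width):
    """Gaussian tuning on a circle (wrapped distance)."""
    d = np.angle(np.exp(1j * (s[:, None] - centers[None, :])))
    return np.exp(-d**2 / (2 * width**2))

c_in = np.linspace(-np.pi, np.pi, n_in, endpoint=False)
c_out = np.linspace(-np.pi, np.pi, n_out, endpoint=False)

F = pop_code(s_grid, c_in, 0.3)    # input code, one row per stimulus
G = pop_code(s_grid, c_out, 0.5)   # desired output code

W, *_ = np.linalg.lstsq(F, G, rcond=None)   # linear weights, noise ignored
err = np.abs(F @ W - G).max()
```

A single weight matrix reproduces the target code for every stimulus on the grid, which is what makes the noiseless mapping problem linear.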
Guiding Questions (At First Part)

Q1:
How do populations of neurons encode information about single variables?
-> p.6~7
How can this information be decoded from the population activity?
-> p.8~12
How do neural populations realize function approximation?
-> p.13~14

Q2:
How do population codes support nonlinear computations
over the information they represent?
-> p.15~17
18 / 41
Encoding Probability Distributions
19 / 41
Motivation

The standard model has two main restrictions:

– We only consider uncertainty coming from noisy neural activities
  (internal noise).
  : But uncertainty is inherent in the stimulus, independent of internal noise.

– We do not consider anything other than estimating a single value.
  : Utilizing the full information contained in the posterior is crucial.
20 / 41
Motivation

"Ill-posed problems" : images do not contain enough information.

The aperture problem
: the image does not unambiguously specify the motion of the object.

Solution – a probabilistic approach
: perception is conceived as statistical inference, giving rise to probability
distributions over the encoded values.
21 / 41
Motivation
22 / 41
Psychophysical Evidence

Perceived speed of a grating increases with contrast.

The nervous system seeks the posterior distribution of velocity given the image
sequence, obtained through Bayes' rule:

P[s|I] = P[I|s] P[s] / P[I]

High contrast -> the likelihood function becomes narrow
-> the likelihood dominates the product P[I|s] P[s]
23 / 41
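The contrast effect can be sketched with Gaussian likelihood and prior, using an assumed zero-mean "slow speed" prior; all parameter values are illustrative:

```python
import numpy as np

def posterior_mean(v_obs, sigma_like, v_prior=0.0, sigma_prior=2.0):
    """Mean of P[v|I] proportional to P[I|v] P[v] for Gaussian likelihood
    and prior: a precision-weighted average of the two means."""
    w = sigma_prior**2 / (sigma_prior**2 + sigma_like**2)
    return w * v_obs + (1.0 - w) * v_prior

v_true = 8.0
high_contrast = posterior_mean(v_true, sigma_like=0.5)  # narrow likelihood
low_contrast = posterior_mean(v_true, sigma_like=3.0)   # broad likelihood
```

At low contrast the broad likelihood lets the slow-speed prior pull the posterior mean down, so the same physical speed is perceived as slower.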
Psychophysical Evidence
24 / 41
Encoding and Decoding Probability Distributions

Log-likelihood method:

The activity of a neuron tuned to prefer velocity v is viewed as reporting
the log-likelihood of the image given the motion, log(P[I|v]).

Provides a statistical interpretation, and decoding only involves the simple
operation of exponentiating to find the full likelihood.

Some schemes for computing with these codes require that the likelihood have
only one peak.
25 / 41
Encoding and Decoding Probability Distributions

Gain encoding for Gaussian distributions:

Use a Bayesian approach to decode a population pattern r -> P[s|r].

Assuming independent noise in the responses of many neurons,
the posterior distribution converges to a Gaussian.

The gain of the population activity controls the standard deviation of the
posterior distribution.

Strong limitation : it only works for simple Gaussian distributions.
26 / 41
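The gain-controls-width claim can be checked numerically. This sketch assumes independent Poisson noise and Gaussian tuning curves with illustrative parameters; quadrupling the gain should halve the posterior standard deviation:

```python
import numpy as np

centers = np.linspace(-10.0, 10.0, 41)   # preferred stimuli
s_grid = np.linspace(-5.0, 5.0, 2001)
F = np.exp(-(s_grid[:, None] - centers[None, :])**2 / 2.0)  # unit-gain tuning

def posterior_std(gain, s_true=0.0):
    """Std of P[s|r] for expected activities r = gain * f(s_true),
    assuming independent Poisson noise."""
    r = gain * np.exp(-(s_true - centers)**2 / 2.0)
    log_post = (r * np.log(F) - gain * F).sum(axis=1)
    p = np.exp(log_post - log_post.max())
    p /= p.sum()
    mean = (p * s_grid).sum()
    return np.sqrt((p * (s_grid - mean)**2).sum())

ratio = posterior_std(gain=5.0) / posterior_std(gain=20.0)  # expect about 2
```

The posterior precision grows linearly with the total spike count, so its standard deviation scales as one over the square root of the gain.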
Encoding and Decoding Probability Distributions
27 / 41
Encoding and Decoding Probability Distributions

Convolution encoding:

Can deal with non-Gaussian distributions that cannot be characterized by a
few parameters, such as their means and variances.

Represents the distribution using a convolution code, obtained by
convolving the distribution with a particular set of kernel functions.
28 / 41
Encoding and Decoding Probability Distributions

Motivation : the Fourier transform
f : a 2π-periodic, odd function (f(−x) = −f(x))

f <-(encoding / decoding)-> {a_n | n ∈ N}

Encoding : a_n = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx

Decoding : f(x) = Σ_{n=1}^{∞} a_n sin(nx)
29 / 41
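The Fourier encode/decode pair above can be sketched directly, with the integral evaluated numerically (the test function and truncation at 25 coefficients are arbitrary choices):

```python
import numpy as np

def encode(f, n_max=25, n_x=20000):
    """a_n = (1/pi) * integral_{-pi}^{pi} f(x) sin(n x) dx (numerically)."""
    x = np.linspace(-np.pi, np.pi, n_x, endpoint=False)
    dx = 2.0 * np.pi / n_x
    return np.array([(f(x) * np.sin(n * x)).sum() * dx / np.pi
                     for n in range(1, n_max + 1)])

def decode(a, x):
    """f(x) = sum_{n>=1} a_n sin(n x)."""
    n = np.arange(1, a.size + 1)
    return (a[None, :] * np.sin(x[:, None] * n[None, :])).sum(axis=1)

f = lambda x: np.sin(x) + 0.5 * np.sin(3 * x)   # an odd, 2*pi-periodic function
a = encode(f)
x_test = np.linspace(-3.0, 3.0, 7)
recon = decode(a, x_test)
```

Each coefficient is an inner product of the function with one kernel, mirroring how each neuron's activity would encode one coefficient of the distribution.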
Encoding and Decoding Probability Distributions

Use a large population of neurons to encode any function by
devoting each neuron to the encoding of one particular coefficient.

The activity of neuron a is computed by taking the inner product between
a kernel function assigned to that neuron and the function being encoded.
30 / 41
Encoding and Decoding Probability Distributions

Encoding schemes

Kernel – sine function :
f_a(P[s|I]) = ∫ ds sin(w_a s + φ_a) P[s|I]

Kernel – Gaussian :
f_a(P[s|I]) = ∫ ds exp(−(s − s_a)² / 2σ_a²) P[s|I]

Kernel – Gaussian, with P[s|I] = δ(s, s*) :
f_a(P[s|I]) = exp(−(s* − s_a)² / 2σ_a²)

31 / 41
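The Gaussian-kernel scheme, including the delta-function special case, can be sketched on a grid (grid resolution, kernel centers, and widths are illustrative assumptions):

```python
import numpy as np

s_grid = np.linspace(-5.0, 5.0, 2001)
ds = s_grid[1] - s_grid[0]
centers = np.linspace(-4.0, 4.0, 17)    # kernel centers s_a
sigma_a = 1.0                           # kernel width

def encode(p):
    """f_a(P) = integral ds exp(-(s - s_a)^2 / 2 sigma_a^2) P(s)."""
    K = np.exp(-(centers[:, None] - s_grid[None, :])**2 / (2 * sigma_a**2))
    return (K * p[None, :]).sum(axis=1) * ds

# A sharply peaked distribution approximating P[s|I] = delta(s - s*), s* = 0.7.
s_star = 0.7
p = np.exp(-(s_grid - s_star)**2 / (2 * 0.01**2))
p /= p.sum() * ds

activities = encode(p)
expected = np.exp(-(s_star - centers)**2 / (2 * sigma_a**2))
```

When the encoded distribution collapses to a delta function, the population activity reduces to the kernel evaluated at s*, recovering the last formula on the slide.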
Encoding and Decoding Probability Distributions

Decoding scheme – Anderson's approach

The activity r_a of neuron a is considered to be a vote for a particular
decoding basis function P_a[s].

Overall decoded distribution : P[s|I] = Σ_a r_a P_a[s] / Σ_b r_b

32 / 41
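Anderson's vote scheme is a normalized mixture and is easy to sketch; the normalized Gaussian basis functions and example activities below are assumed choices:

```python
import numpy as np

s_grid = np.linspace(-5.0, 5.0, 1001)
ds = s_grid[1] - s_grid[0]
centers = np.linspace(-4.0, 4.0, 9)

# Decoding basis functions P_a[s]: normalized Gaussians (an assumed choice).
P_basis = np.exp(-(s_grid[None, :] - centers[:, None])**2 / 2.0)
P_basis /= P_basis.sum(axis=1, keepdims=True) * ds

def decode(r):
    """P[s|I] = sum_a r_a P_a[s] / sum_b r_b."""
    return (r[:, None] * P_basis).sum(axis=0) / r.sum()

r = np.array([0.0, 0.0, 1.0, 4.0, 6.0, 4.0, 1.0, 0.0, 0.0])  # example votes
p = decode(r)
```

Because the result is a convex combination of normalized basis functions, the decoded P[s|I] is automatically a proper distribution.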
Encoding and Decoding Probability Distributions

Decoding scheme – Zemel's approach

A probabilistic approach : recover the most likely distribution over s, P[s|I].

This can be achieved using a nonlinear regression method such as the
Expectation-Maximization algorithm.
33 / 41
Examples in Neurophysiology

Uncertainty in 2-AFC (2-alternative forced choice) tasks
: these examples offer preliminary evidence that neurons represent probability
distributions, or related quantities such as log-likelihood ratios.

There are also experiments supporting gain encoding, convolution codes,
and DDPC, respectively.
34 / 41
Computations Using Probabilistic Population Codes

Experiment by Ernst & Banks (2002) : judging the width of a bar

The optimal strategy : recover the posterior distribution over the width w,
given the visual input V and the haptic input H.

Using Bayes' rule : P[w|V,H] ∝ P[V,H|w] P[w] = P[V|w] P[H|w] P[w]
35 / 41
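For Gaussian likelihoods, the product in Bayes' rule has a closed form. A minimal sketch with a flat prior and illustrative cue reliabilities (the numbers are not from Ernst & Banks):

```python
import numpy as np

def combine(w_v, sigma_v, w_h, sigma_h):
    """Mean and std of P[w|V,H] ~ P[V|w] P[H|w] for Gaussian likelihoods
    centered on the single-cue estimates (flat prior over w assumed)."""
    prec = 1.0 / sigma_v**2 + 1.0 / sigma_h**2
    mean = (w_v / sigma_v**2 + w_h / sigma_h**2) / prec
    return mean, np.sqrt(1.0 / prec)

# Vision sharper than touch: the combined estimate sits closer to the visual cue.
mean, std = combine(w_v=10.0, sigma_v=1.0, w_h=12.0, sigma_h=2.0)
```

The combined estimate weights each cue by its precision and is more reliable (smaller standard deviation) than either cue alone.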
Computations Using Probabilistic Population Codes

If we use a convolution code for all distributions:
– multiply all the population codes together term by term
– this requires neurons that can multiply and sum : achievable neural operations

If the probability distributions are encoded using the position and gain of
population codes:
– Solution : Deneve et al. (2001)
– It performs Bayesian inference using noisy population codes,
  with some limitations
36 / 41
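The term-by-term multiplication can be sketched for a code in which each unit's activity samples the encoded distribution at its preferred value (one simple reading of a convolution code; the Gaussian likelihoods and grid are illustrative assumptions):

```python
import numpy as np

centers = np.linspace(0.0, 20.0, 201)   # each unit samples the code at s_a
ds = centers[1] - centers[0]

def gaussian_code(mu, sigma):
    p = np.exp(-(centers - mu)**2 / (2 * sigma**2))
    return p / (p.sum() * ds)

r_v = gaussian_code(10.0, 1.0)   # code for the visual likelihood over w
r_h = gaussian_code(12.0, 2.0)   # code for the haptic likelihood over w

r_post = r_v * r_h               # term-by-term multiplication
r_post /= r_post.sum() * ds      # normalization (e.g. divisive inhibition)

w_hat = (centers * r_post).sum() * ds   # posterior mean of the bar width
```

Unit-wise multiplication followed by normalization implements the Bayes product P[V|w] P[H|w], and the resulting code's mean matches the precision-weighted cue combination.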
Computations Using Probabilistic Population Codes
37 / 41
Guiding Questions (At Second Part)

Q3: How may neural populations offer a rich representation of such things
as uncertainty in the aspects of the stimuli they represent?

# 21 ~ # 24

The probabilistic approach : perception is conceived as statistical inference,
giving rise to probability distributions over the encoded values.

Hence the activity of neural populations represents probability distributions,
which carry information about uncertainty.
38 / 41
Guiding Questions (At Second Part)

Q4: How can populations of neurons represent probability distributions?
How can they perform Bayesian probabilistic inference?

#25 ~ #31 (for the first), #37 ~ #39 (for the second)

Several schemes have been proposed for encoding probability distributions
in populations of neurons : the log-likelihood method, gain encoding for
Gaussian distributions, and convolution encoding.

Bayesian probabilistic inference can be done by multiplying all the population
codes together (convolution encoding), or by combining noisy population codes
(gain encoding).
39 / 41
Guiding Questions (At Second Part)

Q5: How are multiple aspects of the world represented in single
populations? What computational advantages (or disadvantages) do such
schemes have?

# 25 ~ # 28 (first)

What is represented:
Log-likelihood : the likelihood
Gain encoding : the mean and standard deviation
Convolution encoding : the full probability distribution
40 / 41
Guiding Questions (At Second Part)

Q5: How are multiple aspects of the world represented in single
populations? What computational advantages (or disadvantages) do such
schemes have?

# 25 ~ # 28 (second)

Log-likelihood : decoding is simple, but some distributions are excluded
Gain encoding : strongly limited in the distributions it can represent
Convolution encoding : can work for complicated distributions
41 / 41