Computational vision --- a window to our brain

Li Zhaoping, University College London
At CUSPEA conference on
“Physics in twenty first century”
June 2002, Beijing
From eye to primary visual cortex
Vision: solving for 3D scene from 2D image
A difficult problem: many possible solutions, one (or a few) perceptions
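The "many possible solutions" point can be illustrated with a minimal sketch, assuming an idealized pinhole camera (the projection model, the function name `project`, and the focal length are illustrative assumptions, not part of the talk). Every 3D point along the same line of sight lands on the same 2D image point, so the image alone cannot tell them apart.

```python
def project(point, focal_length=1.0):
    """Perspective projection of a 3D point (x, y, z) onto a 2D image plane.

    Assumes an ideal pinhole camera at the origin looking along +z.
    """
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same ray through the pinhole.
near = (1.0, 2.0, 4.0)
far = tuple(3 * v for v in near)  # three times farther away, three times larger offsets

image_near = project(near)
image_far = project(far)

# Both scene points produce the same image point: depth is lost in the image.
print(image_near, image_far)
```

Recovering the scene therefore needs constraints beyond the image itself, which is one way to read "one (or a few) perceptions".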
Two of the most difficult problems
(1) Object invariance, (2) visual segmentation; both still unsolved.
The segmentation problem
Dilemma: segmentation requires recognition,
recognition requires segmentation.
Human Eye and Retina
Human Cone Lattice
Retina sampling (From B. Olshausen)
Receptive field of a retinal ganglion cell (on-center cell)
Two cells in primary visual cortex
Neurons in Human Cortex
Visual areas in the brain
Visual Area wiring diagram
Some numbers:
Retinal size: 5 cm x 5 cm; 0.4 mm thick
One degree of visual angle = 0.3 mm on the retina
Number of cones in each retina: 5 x 10^6
Number of rods in each retina: 10^8
Peak cone density: 1.6 x 10^5 cones/mm^2
Total number of cortical neurons: 10^10;
4 x 10^3 synapses/neuron;
Number of macaque (monkey) visual areas: 30
Size of each area V1: 3 cm by 8 cm
Half of area V1 represents the central 10 deg (2% of the visual field)
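A few consequences follow from these numbers by simple arithmetic. The sketch below reads the peak cone density as an areal density (cones per square millimetre) and, as a simplifying assumption, treats the cone lattice as square when converting to a linear density; the variable names are illustrative.

```python
import math

# Values taken from the list above.
MM_PER_DEG = 0.3             # retinal distance per degree of visual angle
PEAK_CONE_DENSITY = 1.6e5    # cones per mm^2 at the point of peak density
CORTICAL_NEURONS = 1e10
SYNAPSES_PER_NEURON = 4e3
V1_AREA_CM2 = 3 * 8          # 3 cm by 8 cm

# Cones per square degree of visual angle at peak density.
cones_per_deg2 = PEAK_CONE_DENSITY * MM_PER_DEG ** 2

# Assumption: square lattice, so linear density is the square root.
cones_per_deg = math.sqrt(cones_per_deg2)

# Sampling limit: at least two samples are needed per cycle of a grating.
nyquist_cycles_per_deg = cones_per_deg / 2

# Total number of cortical synapses.
total_synapses = CORTICAL_NEURONS * SYNAPSES_PER_NEURON

# Cortical area devoted to the central 10 degrees (half of V1).
central_v1_cm2 = V1_AREA_CM2 / 2

print(round(cones_per_deg), "cones/deg,", round(nyquist_cycles_per_deg), "cycles/deg")
print(f"{total_synapses:.0e} synapses, {central_v1_cm2} cm^2 of V1 for the central 10 deg")
```

Under these assumptions the sampling limit comes out at roughly 60 cycles per degree, and about 12 cm^2 of V1 serves only 2% of the visual field.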
Binocular Rivalry
Color Phenomena
Hering Illusion
Hermann Grid Illusion
A variation
Subjective Contours
(Kanizsa triangle)
Color Afterimage
Neon Disk
Adelson Illusion
Color Illusion
Reversible ripple
Focus on the segmentation problem in vision --- region segmentation
A region can be characterized by its average
luminance, regularity, smoothness, and many
other measures.
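As a minimal sketch of such measures, the code below computes three simple statistics for an image patch. The specific choices (variance as a stand-in for regularity, mean neighbour difference as a stand-in for smoothness) and the function names are illustrative assumptions, not definitions from the talk.

```python
def mean_luminance(patch):
    """Average pixel value of a patch given as a list of rows."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)


def variance(patch):
    """Spread of pixel values around the patch mean."""
    m = mean_luminance(patch)
    values = [v for row in patch for v in row]
    return sum((v - m) ** 2 for v in values) / len(values)


def roughness(patch):
    """Mean absolute difference between horizontally adjacent pixels."""
    diffs = [abs(row[j + 1] - row[j]) for row in patch for j in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


# Two 4 x 4 patches with the same average luminance.
gray = [[0.5] * 4 for _ in range(4)]
checker = [[float((r + c) % 2) for c in range(4)] for r in range(4)]

for name, patch in (("gray", gray), ("checker", checker)):
    print(name, mean_luminance(patch), variance(patch), roughness(patch))
```

The two patches share the same mean luminance but differ in the other two measures, which is why average luminance alone is not enough to tell regions apart.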
Biological experimental background
In early visual stages (retina or primary visual cortex),
the output of a neuron can be approximated, with varying
accuracy, as the image passed through a linear filter
(kernel). These filters are called receptive fields.
These filters (kernels K) are all very small,
compared to the sizes of visual objects
(e.g., apple)
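The linear-filter description can be sketched in code. The example below assumes an on-center difference-of-Gaussians kernel, a common textbook stand-in for a retinal receptive field; the kernel size, the two widths, and the function names are illustrative assumptions rather than values from the talk.

```python
import math


def dog_kernel(size=7, sigma_center=1.0, sigma_surround=2.0):
    """On-center difference-of-Gaussians kernel K (each Gaussian sums to 1)."""
    half = size // 2

    def gaussian(sigma):
        g = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
              for x in range(-half, half + 1)]
             for y in range(-half, half + 1)]
        total = sum(map(sum, g))
        return [[v / total for v in row] for row in g]

    center, surround = gaussian(sigma_center), gaussian(sigma_surround)
    return [[center[i][j] - surround[i][j] for j in range(size)]
            for i in range(size)]


def response(image, kernel, row, col):
    """Linear response of a model neuron centered at (row, col).

    The kernel must fit inside the image at that position.
    """
    half = len(kernel) // 2
    total = 0.0
    for i, kernel_row in enumerate(kernel):
        for j, k in enumerate(kernel_row):
            total += k * image[row + i - half][col + j - half]
    return total


size = 15
uniform = [[1.0] * size for _ in range(size)]   # featureless image
spot = [[0.0] * size for _ in range(size)]      # single bright pixel
spot[7][7] = 1.0

K = dog_kernel()
r_uniform = response(uniform, K, 7, 7)    # close to zero: center and surround cancel
r_center = response(spot, K, 7, 7)        # positive: light falls on the center
r_surround = response(spot, K, 7, 10)     # negative: light falls on the surround
print(r_uniform, r_center, r_surround)
```

The kernel here spans only 7 x 7 pixels, which mirrors the point that each filter sees a region far smaller than a typical object.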
Local interactions between cortical neurons
As manifested in experiments showing contextual influences on neural responses.
From local to global !!!
V1 model
My Theory: V1 produces a saliency map from images
Input to model
Highlighting important image locations that signal the breaking of translation invariance
Details of this theory are published in Trends in Cognitive Sciences, Vol. 6, No. 1, Jan. 2002, pp. 9-16.
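The idea that local interactions can highlight where translation invariance breaks can be illustrated with a toy sketch. This is not the V1 model from the talk: it is a single feedforward pass in which, by assumption, each location's response is reduced in proportion to the fraction of immediate neighbours that share its orientation. The function name and the `suppression` parameter are hypothetical.

```python
def saliency_map(orientations, radius=1, suppression=0.8):
    """Toy saliency: response drops when neighbours share the same orientation.

    `orientations` is a grid (list of rows) of bar orientations in degrees.
    Neighbours outside the grid are ignored.
    """
    rows, cols = len(orientations), len(orientations[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            same = total = 0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) == (0, 0) or not (0 <= rr < rows and 0 <= cc < cols):
                        continue
                    total += 1
                    if orientations[rr][cc] == orientations[r][c]:
                        same += 1
            out[r][c] = 1.0 - suppression * same / total
    return out


# Texture of bars: left half at 0 degrees, right half at 90 degrees.
texture = [[0] * 4 + [90] * 4 for _ in range(6)]
sal = saliency_map(texture)

for row in sal:
    print(" ".join(f"{v:.2f}" for v in row))
```

Inside each uniform region every location has the same neighbourhood, so responses are equal and low; only the columns next to the texture border respond more strongly, even though each unit sees just its immediate neighbours.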
Examples of pre-attentive segmentation explained by the model
The theory:
• explains a large body of experimental data;
• links physiological and psychological data (the two
research communities do not talk to each other as much
without such theories);
• provides experimentally testable predictions, motivating
new experiments and promoting a theoretical approach in
a traditionally experimental field;
• calls for new explorations of interesting mathematics of
dynamical systems not yet encountered in traditional
physics.