Transcript PPT

Vision in Humans
and Machines
September 10, 2009
Introduction to Cognitive Science
Lecture 2: Vision in Humans and Machines
1
Visible light is just a part of the
electromagnetic spectrum
Cross Section of the Human Eye
Anatomy of the Visual System
• The Eyes
  • Cornea: Transparent outer covering of the eye that admits light.
  • Pupil: Adjustable opening in the iris that regulates the amount of light that enters the eye.
  • Iris: Pigmented ring of muscles situated behind the cornea.
Anatomy of the Visual System
• Photoreceptors
  • Retina: The neural tissue and photoreceptive cells located on the inner surface of the posterior portion of the eye.
  • Rod: Photoreceptor cell of the retina, sensitive to light of low intensity.
  • Cone: Photoreceptor cell of the retina; maximally sensitive to one of three different wavelengths of light and hence encodes color vision.
Anatomy of the Visual System
• The Eyes
  • Lens: Consists of a series of transparent, onion-like layers. Its shape can be changed by contraction of the ciliary muscles.
  • Accommodation: Changes in the thickness of the lens, accomplished by the ciliary muscles, that focus images of near or distant objects on the retina.
Anatomy of the Visual System
• The Eyes
  • Fovea: Area of the retina that mediates the most acute vision. Contains only color-sensitive cones.
  • Optic Disk: Location on the retina where fibers of ganglion cells exit the eye. Responsible for the blind spot.
Coding of Visual Information in the Retina
• Coding of Light and Dark
  • Receptive field: That portion of the visual field in which the presentation of visual stimuli will produce an alteration in the firing rate of a particular neuron.
[Figure labels: Photoreceptor, Bipolar, Ganglion]
Major cell types of the retina
Receptive fields
Color Mixing
Coding of Visual Information in the Retina
• Photoreceptors: Trichromatic Coding
  • Peak wavelength sensitivities of the three cones:
    Blue cone: short wavelength, blue-violet (420 nm)
    Green cone: medium wavelength, green (530 nm)
    Red cone: long wavelength, yellow-green (560 nm)
Coding of Visual Information in the Retina
• Retinal Ganglion Cells:
  • Opponent-Process Coding
    • Negative afterimage: The image seen after a portion of the retina is exposed to an intense visual stimulus; consists of colors complementary to those of the physical stimulus.
    • Complementary colors: Colors that make white or gray when mixed together.
Analysis of Visual Information
• Anatomy of the Striate Cortex
  • David Hubel and Torsten Wiesel
    • 1960s at Harvard University
    • Discovered that neurons in the visual cortex did not simply respond to light; they selectively responded to specific features of the visual world.
Stimuli in receptive field of neuron
Cat V1 (striate cortex)
Orientation preference map
Ocular dominance map
“Data Flow Diagram” of Visual Areas in Macaque Brain
Blue: motion perception pathway
Green: object recognition pathway
Computer Vision
Typical computer vision applications are complex
and consist of different levels of processing, from
the low-level pixel-by-pixel analysis to the high-level
creation of scene descriptions.
Generally, computer vision systems consist of an
image processing stage, followed by a scene
analysis stage.
The following slide outlines the structure of a
computer vision system.
Computer Vision
A simple two-stage model of computer vision:
Bitmap image -> Image processing -> Scene analysis -> Scene description
feedback (tuning)
Image processing: Prepare image for scene analysis
Scene analysis: Build an iconic model of the world
Computer Vision
The image processing stage prepares the input
image for the subsequent scene analysis.
Usually, image processing results in one or more new
images that contain specific information on relevant
features of the input image.
The information in the output images is arranged in
the same way as in the input image. For example, in
the upper left corner in the output images we find
information about the upper left corner in the input
image.
Computer Vision
The scene analysis stage interprets the results from
the image processing stage.
Its output completely depends on the problem that the
computer vision system is supposed to solve.
For example, it could be the number of bacteria in a
microscopic image, or the identity of a person
whose retinal scan was input to the system.
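The two stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system from the lecture: the function names (image_processing, scene_analysis, count_objects), the threshold value, and the toy bitmap are all hypothetical, and objects are assumed to be darker than the background. Stage 1 produces a new image arranged like the input; stage 2 reduces it to a problem-specific answer, here a count of objects.

```python
def image_processing(bitmap, threshold):
    """Stage 1: map a grayscale bitmap to a binary image (1 = object pixel)."""
    # Assumption: object pixels are darker than background pixels.
    return [[1 if pixel < threshold else 0 for pixel in row] for row in bitmap]


def scene_analysis(binary):
    """Stage 2: count groups of 4-connected object pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            count += 1                      # found a new, unvisited object
            seen[r][c] = True
            stack = [(r, c)]
            while stack:                    # flood-fill the whole object
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def count_objects(bitmap, threshold=128):
    """Run both stages: bitmap image in, scene description (a count) out."""
    return scene_analysis(image_processing(bitmap, threshold))


# Toy "microscopic image" with two dark blobs on a bright background.
bitmap = [
    [200, 200, 200, 200, 200],
    [200,  30,  40, 200, 200],
    [200,  35, 200, 200,  20],
    [200, 200, 200, 200,  25],
]
print(count_objects(bitmap))
```

For a different problem, only the second stage would need to change; the first stage still delivers an image with the same layout as the input.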
Digitizing Visual Scenes
With regard to spatial resolution, we will map the
intensity in our image onto a two-dimensional finite
array:
[Figure: grid of array cells with axes labeled x’ and y’]
[0, 0] [0, 1] [0, 2] [0, 3]
[1, 0] [1, 1] [1, 2] [1, 3]
[2, 0] [2, 1] [2, 2] [2, 3]
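As a rough sketch of this mapping, the Python code below samples a continuous intensity function at the center of each cell of a finite grid and quantizes the result. The names digitize and ramp, the 256 gray levels, and the convention that the first index is the row are assumptions made for this example; the slide does not specify them.

```python
def digitize(intensity, width, height, rows, cols, levels=256):
    """Sample intensity(x, y) at each cell center and quantize to integers."""
    image = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Center of cell [i, j] in continuous scene coordinates.
            x = (j + 0.5) * width / cols
            y = (i + 0.5) * height / rows
            # Clamp to [0, 1], then map to one of `levels` integer values.
            value = min(max(intensity(x, y), 0.0), 1.0)
            row.append(min(int(value * levels), levels - 1))
        image.append(row)
    return image


def ramp(x, y):
    """Hypothetical scene: brightness increases from left to right."""
    return x / 4.0


# A 3 x 4 array, matching the index range [0, 0] to [2, 3] shown above.
image = digitize(ramp, width=4.0, height=3.0, rows=3, cols=4)
for row in image:
    print(row)
```

A finer grid (more rows and columns) gives higher spatial resolution; more levels give finer intensity resolution.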
Thresholding
Here, the right image is created from the left image by
thresholding, assuming that object pixels are darker
than background pixels.
As you can see, the result is slightly imperfect (dark
background pixels).
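A minimal Python sketch of thresholding under the same assumption (object pixels are darker than background pixels). The function name threshold_image, the threshold of 100, and the sample intensities are invented for illustration.

```python
def threshold_image(image, threshold):
    """Return a binary image: 1 where the pixel is darker than the threshold."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


# A dark 2 x 2 object on a bright background, plus two dark background
# pixels (values 90 and 95) that end up misclassified as object pixels.
image = [
    [ 90, 220, 210, 215, 220],
    [205,  40,  50, 225, 210],
    [200,  45,  35, 230,  95],
]
binary = threshold_image(image, 100)
for row in binary:
    print(row)
```

The two stray 1s in the corners of the output correspond to the dark background pixels mentioned on the slide.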
Geometric Properties
September 4, 2007
Computer Vision
Lecture 1: Digital Images/Binary Image Processing
32
Geometric Properties
We could teach our program what the objects look
like at different sizes and orientations, and let the
program search all possible positions in the input.
However, that would be a very inefficient and
inflexible approach.
Instead, it is much simpler and more efficient to
standardize the input before performing object
recognition.
We can scale the input object to a given size, center
it in the image, and rotate it towards a specific
orientation.
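To standardize an object we first need its size, position, and orientation. The sketch below computes these from a binary image using area, centroid, and second-order central moments (the axis of least second moment). It is one common way to obtain these quantities, offered here as an assumption rather than as the lecture's method; the function name geometric_properties is hypothetical.

```python
import math


def geometric_properties(binary):
    """Return (area, (cx, cy), theta) for the object pixels of a binary image."""
    pixels = [(x, y)
              for y, row in enumerate(binary)
              for x, value in enumerate(row) if value]
    area = len(pixels)                       # size: used for scaling
    if area == 0:
        raise ValueError("image contains no object pixels")
    cx = sum(x for x, _ in pixels) / area    # centroid: used for centering
    cy = sum(y for _, y in pixels) / area
    # Second-order central moments describe how the object is spread out.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    # Orientation of the principal axis, in radians, measured in image
    # coordinates (x to the right, y downward): used for rotation.
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return area, (cx, cy), theta


bar = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
print(geometric_properties(bar))
print(geometric_properties(diagonal))
```

With these three values, the object can be scaled to a reference area, shifted so that its centroid lies at the image center, and rotated by -theta.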
Noise Reduction
Here, a size filter perfectly removes all noise in the
input image.
Noise Reduction
However, if our threshold is too high, “accidents” may
happen.
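A size filter can be sketched as follows: find each group of connected object pixels and keep it only if it contains at least a minimum number of pixels. The name size_filter, the use of 4-connectivity, and the sample image are assumptions for this example. The second call shows the kind of "accident" that a too-high threshold causes.

```python
def size_filter(binary, min_size):
    """Keep only 4-connected components with at least min_size pixels."""
    rows, cols = len(binary), len(binary[0])
    result = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Collect all pixels of the component that contains (r, c).
            component = []
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Copy the component to the output only if it is large enough.
            if len(component) >= min_size:
                for y, x in component:
                    result[y][x] = 1
    return result


# A 2 x 2 object plus two isolated noise pixels.
noisy = [
    [1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
]
print(size_filter(noisy, 2))   # noise removed, object kept
print(size_filter(noisy, 5))   # threshold too high: the object is removed too
```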
Edge Detection
Calculating the magnitude of the brightness gradient
with a Sobel filter. Left: original image; right: filtered
image.
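The computation can be sketched in plain Python: the two standard 3 x 3 Sobel kernels estimate the horizontal and vertical brightness change at each pixel, and the gradient magnitude combines them. The function name sobel_magnitude and the choice to leave border pixels at zero are assumptions of this sketch.

```python
import math

# Standard Sobel kernels for horizontal (x) and vertical (y) brightness change.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [ 0,  0,  0],
           [ 1,  2,  1]]


def sobel_magnitude(image):
    """Return the gradient magnitude at each interior pixel (borders stay 0)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += SOBEL_X[dr + 1][dc + 1] * pixel
                    gy += SOBEL_Y[dr + 1][dc + 1] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# A vertical edge: dark on the left, bright on the right.
image = [
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
]
edges = sobel_magnitude(image)
print(edges[1])
```

The response is zero in the flat region and large at the pixels next to the brightness step.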
Texture
Texture
Texture is an important cue for biological vision
systems to estimate the boundaries of objects.
Also, texture gradient is used to estimate the
orientation of surfaces.
For example, on a perfect lawn the grass texture is
the same everywhere.
However, the further away we look, the finer this
texture becomes – this change is called texture
gradient.
For the same reasons, texture is also a useful feature
for computer vision systems.
Texture Gradient
Texture
The most fundamental question is: How can we
“measure” texture, i.e., how can we quantitatively
distinguish between different textures?
Of course it is not enough to look at the intensity of
individual pixels.
Since the repetitive local arrangement of intensity
determines the texture, we have to analyze
neighborhoods of pixels to measure texture
properties.
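One very simple neighborhood measure is the variance of intensity inside a small window. It is used here only as an illustration; it is an assumption of this sketch, not a measure named in the lecture, and practical texture descriptors are more elaborate. The two sample images have the same mean intensity, so they cannot be told apart by average brightness, but their neighborhoods differ clearly.

```python
def local_variance(image, r, c, radius=1):
    """Variance of intensity in the (2*radius+1)^2 window centered at (r, c)."""
    rows, cols = len(image), len(image[0])
    if r - radius < 0 or c - radius < 0 or r + radius >= rows or c + radius >= cols:
        raise ValueError("window must lie completely inside the image")
    values = [image[y][x]
              for y in range(r - radius, r + radius + 1)
              for x in range(c - radius, c + radius + 1)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Two 4 x 4 images with the same mean intensity (100) but different texture.
smooth = [[100] * 4 for _ in range(4)]
checker = [[200 if (y + x) % 2 == 0 else 0 for x in range(4)] for y in range(4)]

print(local_variance(smooth, 1, 1))    # uniform surface: no local variation
print(local_variance(checker, 1, 1))   # checkerboard: strong local variation
```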
Stereo Vision
Geometry of binocular stereo vision
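The figure itself is not part of this transcript. As a hedged illustration, the sketch below assumes the simplest textbook setup: two identical cameras with parallel optical axes, separated by a baseline. In that setup, the depth of a point equals focal length times baseline divided by disparity, where disparity is the difference between the point's horizontal image positions in the left and right views. The function name and all numbers are hypothetical.

```python
def depth_from_disparity(focal_length, baseline, x_left, x_right):
    """Depth of a point seen by two parallel cameras.

    focal_length, x_left, and x_right share one unit (for example pixels);
    the result has the same unit as baseline.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length * baseline / disparity


# Hypothetical values: focal length 800 pixels, baseline 0.065 m,
# and a point imaged at x = 420 (left view) and x = 400 (right view).
near = depth_from_disparity(800, 0.065, 420, 400)
far = depth_from_disparity(800, 0.065, 410, 400)
print(near, far)
```

Halving the disparity doubles the estimated depth, which is why small disparities (distant objects) give less precise depth estimates.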
Statistical Pattern Recognition
Object Recognition
This algorithm learns to recognize 25 different chairs.
It is shown each chair from 25 different viewing angles.
The Algorithm