Lecture 2 - Computer Science
Computational Vision
CSCI 363, Fall 2012
Lecture 2
Introduction to Vision Science
1
What's the Big Deal?
•Vision seems easy. It is effortless for us.
•Building machine vision systems is hard. Machines still
cannot see as flexibly or reliably as people do.
•Understanding how the brain processes visual information is
hard. We still understand only the most basic computations.
•To understand why it is so difficult, we must examine the
problem.
2
The properties of light
•Light is emitted from one or more sources. These may be point
sources or more distributed sources of light.
•The light hits surfaces and interacts with them, with some being
reflected, some absorbed and some transmitted.
•The reflected light may bounce off multiple surfaces before
reaching the eye.
•Some of the light rays will eventually reach the eye and be
focused on the retina.
•We will diagram this in class.
3
The Eye as a Pinhole Camera
•We can approximate the image formation performed by the lens
of the eye as a pinhole camera.
•Light rays from an object project through a single point (the
center of projection) onto the retina (or image plane).
•This is perspective projection.
•We will diagram this in class.
•Objects that are more distant form smaller images. We will work
out the equation in class.
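The equation itself is not written out on this slide. As a sketch, the standard pinhole relation is x = f*X/Z and y = f*Y/Z, assuming a focal length f (distance from the center of projection to the image plane) and a point (X, Y, Z) with Z measured along the optical axis. The function names and the sample numbers below are illustrative assumptions, not from the lecture.

```python
def project(point, f=1.0):
    """Perspective-project a 3D point (X, Y, Z) through a pinhole.

    Uses the non-inverted (virtual image plane) convention, so the
    image is not flipped as it would be on a real retina.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the center of projection (Z > 0)")
    return (f * X / Z, f * Y / Z)


def image_height(object_height, distance, f=1.0):
    """Image size of an object of a given height at a given distance."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return f * object_height / distance


# The same 2 m tall object, seen at 10 m and then at 20 m.
near = image_height(2.0, 10.0)
far = image_height(2.0, 20.0)
print(near, far)  # the image at 20 m is half the size of the image at 10 m
```

Doubling the distance halves the image size, which is the sense in which more distant objects form smaller images.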
4
Ill-posed problems
•It is not so hard to compute a 2D image from a 3D scene.
•It is very difficult to compute the original 3D scene from the
2D image.
•Many aspects of a 2D image are consistent with multiple (or
even infinitely many) possible arrangements of the 3D scene.
•Because there is no unique solution, the problem is ill-posed.
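A small sketch of why the inverse problem has no unique solution, assuming the same pinhole model (x = f*X/Z, y = f*Y/Z): every 3D point on the ray through one image point projects to that same image point. The helper names `project` and `points_on_ray`, and the chosen depths, are hypothetical and only for illustration.

```python
def project(point, f=1.0):
    """Forward problem: 3D point (X, Y, Z) to 2D image point (x, y)."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


def points_on_ray(image_point, depths, f=1.0):
    """Inverse problem: 3D points, one per depth, that all map to image_point."""
    x, y = image_point
    return [(x * Z / f, y * Z / f, Z) for Z in depths]


# Three different 3D points at depths 1, 2 and 8 ...
candidates = points_on_ray((0.25, 0.5), depths=[1.0, 2.0, 8.0])
# ... produce exactly the same image point.
images = [project(p) for p in candidates]
print(candidates)
print(images)
```

The forward direction is a single formula, while the inverse direction returns a whole family of answers unless depth is fixed by some additional assumption.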
5
Visual Difficulties
•An image of a given size can represent a small, nearby object or
a large, distant object.
•The amount of reflected light changes at the corners of a room
or of objects. How do we see the surface color as constant?
•How do we know which edges belong to which objects?
•When you move your eyes, the image moves. How do you
know the world is not moving?
6
How do we solve it?
•To solve ill-posed problems, we must make assumptions about
the scene and the objects in it.
•These assumptions are also known as heuristics or constraints.
•We can determine which assumptions are made by the human
visual system by performing psychophysical experiments.
•Psychophysical experiments use a known physical stimulus to
test what information is important for perception and what the
limits of the visual system are.
7
Height in the Visual Field
In normal images, things that are farther away often have images that are higher in the visual field.
The visual system uses this height in the visual field as a cue to distance.
We then adjust our interpretation of the size of the object based on this distance.
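One way such a cue could be turned into numbers, as a sketch only and not a claim about how the visual system actually computes it: assume a pinhole eye at height h above a flat ground plane, looking horizontally. A ground point at distance Z then appears f*h/Z below the horizon, so points nearer the horizon (higher in the field) are farther away. The function names, the 1.5 eye height, and the sample values are all assumptions.

```python
def distance_from_height_in_field(y_below_horizon, eye_height=1.5, f=1.0):
    """Distance to a ground point imaged y_below_horizon below the horizon.

    Assumes flat ground and a horizontal line of sight (illustrative model).
    """
    if y_below_horizon <= 0:
        raise ValueError("the point must be imaged below the horizon")
    return f * eye_height / y_below_horizon


def inferred_size(image_size, y_below_horizon, eye_height=1.5, f=1.0):
    """Physical size implied by an image size once distance is estimated."""
    Z = distance_from_height_in_field(y_below_horizon, eye_height, f)
    return image_size * Z / f


# Two figures with identical image size (0.25), but the second one's feet
# are imaged closer to the horizon, i.e. higher in the visual field.
low_in_field = inferred_size(0.25, y_below_horizon=0.5)
high_in_field = inferred_size(0.25, y_below_horizon=0.125)
print(low_in_field, high_in_field)
```

Under these assumptions the figure that is higher in the field is judged farther away and therefore physically larger, even though the two images are the same size.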
8
A small man
When we remove the height cue, the second man looks small indeed!
9
Assumption of Right Angles
The visual system has a tendency to interpret angles as right angles.
We see this as a right angle
This allows us to construct interesting visual illusions such as...
10
The Ames Room
Ever wonder how the hobbits are made to look so short in
"The Lord of the Rings" movies?
11
Marr's approach to Vision
Marr's framework for computational vision:
1) Theory
What are we computing?
Why are we computing it?
What assumptions are needed?
2) Algorithm
How do we do the computation?
Is it parallel or serial?
Create explicit computational steps.
3) Implementation
Deal with specific hardware to carry out the algorithm
(e.g. transistors vs. neurons).
12
Other ways to segment the problem
1. Molecules (channels, neurotransmitters, DNA) -> single cells
(dendritic integration, generation of action potentials) ->
small neural networks (retina, hippocampus) -> large neural
networks (visual cortex) -> entire brain.
2. Stages of vision: retina -> LGN (lateral geniculate nucleus) ->
primary visual cortex -> extrastriate cortex
13
Discussion Questions
According to Marr, we must specify the problem we want to solve.
What is vision good for? How would you specify the tasks that
need to be accomplished?
14