MULTIMEDIA
SIGNAL PROCESSING
BASIC PROBLEMS IN PROCESSING
MEDIA INFORMATION
MMSP
Irek Defée
Kinect – new media interface
• Before we proceed, we mention an important
development in the progress of media interfaces
• This is a device and system called Kinect, made
by Microsoft. Kinect has been available as a product
since the beginning of November 2010
Kinect is a part of the Microsoft Xbox game platform
but it can be bought separately!
What is Kinect?
• Kinect is a new type of hardware for
interacting with people - with proper software
support of course
• Kinect looks like this
What is inside Kinect?
There is hardware worth about 40 euro, working according to the following schematic,
plus software which extracts signals and sends them to the Xbox for processing.
Processing takes about 5% of the Xbox's power (Xenon processor)
How does Kinect work?
Kinect has FOUR microphones to capture spatial sound, attenuate noise and
interference, and compensate for room acoustics
Kinect has a small color camera with 640x480 resolution
Most advanced aspect
Kinect ”eyes”
The eyes of Kinect are made by an INFRARED MEASUREMENT SYSTEM
A laser beam is sent from the objective and received by the sensor, as can be seen above.
These sensors can move to adjust for distance and height. This device
produces a MAP OF DEPTH to objects in a room.
The device can thus ’see’ in bad light or in darkness. Before use it is TRAINED
with movements of persons in the room. You can see on the right that in infrared
the beam makes lots of measurement dots
What does Kinect do?
Kinect recognizes voice IN ROOMS and can be used for voice control of applications
Kinect recognizes persons and body movements, which is used in applications
But before this Kinect is TRAINED interactively, as shown in the pictures
After the training, persons and body movements will be
recognized. More than one person can be identified in a
scene
Why is Kinect revolutionary?
• It is the first practical natural interface for
machines communicating with people
• It works in normal rooms
• It combines the acoustic and visual senses
• It recognizes full body movements, even
complicated ones
• It recognizes persons
• It works well; it is not perfect, but one can
predict there will be much more in the future
Kinect applications
Games and interactive playing (sports, dancing)
More applications: exercising, rehabilitation, child development
Control of devices by voice, gestures
Automation, robotics
More…. we do not know yet… but the public drivers are partially
available
Back to the lectures
• We continue with the overview of
biological systems and principles of
sensory information processing, and finish
it with some conclusions
FROM PREVIOUS LECTURES WE KNOW
THAT MULTIMEDIA INFORMATION
PROCESSING IS EXCELLENTLY DONE BY
THE HUMAN INFORMATION PROCESSING
SYSTEM
• OUR PROBLEM IS:
Biological systems process
audiovisual information using special
”hardware” (which could be called ’wetware’)
and ’software’, that is, algorithms.
The question is: Can we process
audiovisual information using different hardware
and software? Maybe the algorithms could be similar?
Let us take visual processing as an example
IN THE HUMAN VISUAL
SYSTEM PROCESSING
STARTS IMMEDIATELY IN
THE RETINA, AND THERE ARE
SEPARATE ACQUISITION AND
PROCESSING SYSTEMS FOR
COLOR AND FOR
BLACK AND WHITE LIGHT
FROM COLOR AND BLACK &
WHITE RECEPTORS SIGNALS
GO TO INITIAL
PROCESSING ELEMENTS
OUTPUT LINKS
IT IS IMPORTANT TO NOTICE THAT
THE NUMBER OF COLOR PROCESSING
ELEMENTS IS MUCH LOWER THAN
THE NUMBER OF BLACK AND WHITE ONES
• WHAT DO THESE PROCESSING ELEMENTS
DO?
MOST RECENT
MEASUREMENTS OF RETINAL NEURAL
CELLS SHOW THAT THEIR RECEPTIVE
FIELDS ARE QUITE IRREGULAR
IN THE FOLLOWING PAGES SOME
INFORMATION ABOUT WHAT THESE
CELLS ARE DOING IS GIVEN
• A BAR OF LIGHT IS MOVED OVER
THE PHOTORECEPTORS
IN DIFFERENT DIRECTIONS
THE OUTPUT OF THE PHOTORECEPTORS
IS SUMMED
WITH POSITIVE SIGN
(EXCITATION) OR
NEGATIVE SIGN
(INHIBITION)
DEPENDING ON THE DIRECTION OF MOTION, THE SIGNALS SUM UP
STRONGLY OR NOT
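The signed summation can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical 5x5 patch of photoreceptors with an excitatory centre column and inhibitory flanks; the weights and the names `weights`, `bar` and `response` are invented for the example, not measured values. For simplicity it varies the orientation of a static bar rather than simulating motion over time.

```python
import numpy as np

# Hypothetical 5x5 receptive field: positive weights = excitation,
# negative weights = inhibition. Values are illustrative only.
weights = np.zeros((5, 5))
weights[:, 2] = 1.0    # excitatory centre column
weights[:, 1] = -0.5   # inhibitory left flank
weights[:, 3] = -0.5   # inhibitory right flank

def bar(orientation):
    """Return a 5x5 image with a one-pixel-wide bright bar through the centre."""
    img = np.zeros((5, 5))
    if orientation == "vertical":
        img[:, 2] = 1.0
    elif orientation == "horizontal":
        img[2, :] = 1.0
    elif orientation == "diagonal":
        img[np.arange(5), np.arange(5)] = 1.0
    else:
        raise ValueError(f"unknown orientation: {orientation}")
    return img

def response(img):
    """Signed sum of photoreceptor outputs, clipped at zero like a firing rate."""
    return max(0.0, float(np.sum(weights * img)))

for name in ("vertical", "horizontal", "diagonal"):
    print(name, response(bar(name)))
```

With these weights only the bar aligned with the excitatory column gives a strong sum; for the other two orientations excitation and inhibition cancel.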
• HERE THE MEASURED SIGNALS
ARE SHOWN
FOR CELLS WHICH
REACT STRONGLY TO
A WHITE BAR ON A BLACK
BACKGROUND AND
THE OPPOSITE (off)
• HERE WE SEE THE
RESPONSE MEASURED OVER
TIME
• WE CAN SEE THAT INITIAL PROCESSING
IN THE EYE INCLUDES
DETECTION OF DIRECTIONAL CHANGES
IN LIGHT INTENSITY
THIS MIGHT BE DONE FOR DIFFERENT
COLORS TOO
WE CAN NOW ASK THE FOLLOWING QUESTIONS:
WHY IS THE PROCESSING ORGANISED IN
THIS WAY? FOR THE ANSWER WE CAN ASSUME
THAT THE PROCESSING IS OPTIMISED IN SOME
WAY.
WHAT MIGHT BE THE OPTIMISATION CRITERIA?
WHAT ARE THE GENERAL PRINCIPLES OF
HUMAN/BIOLOGICAL INFORMATION
PROCESSING?
OVERLAPPING SQUARES OR NOT???
• WHY DO WE SEE HERE THREE SQUARES
AND NOT CUT-OUT SQUARES?
THIS IS BECAUSE THE VISUAL
SYSTEM PRODUCES THE
INTERPRETATION WHICH IS
MOST PLAUSIBLE (GENERIC)
BUT IT MAY BE WRONG TOO,
ALTHOUGH WE WOULD BE
SURPRISED IF IT REALLY
WERE!!!
NOTE THAT ONLY ONE
SQUARE IS FULLY
VISIBLE, OTHERS ARE
COVERED, IN FACT
THEY MAY NOT BE
SQUARES
• THE INTERPRETATION PRODUCED IS FOR
DETECTING THE MOST PROBABLE
OBJECTS
THE UPPER FIGURE IS DETECTED AS
AN ARCH OVERLAID ON THE
SAWTOOTH
THIS IS THE MOST PROBABLE
INTERPRETATION
THE BOTTOM FIGURE
INTERPRETATION IS SURPRISING,
BUT IT COULD ALSO BE PRODUCED
IF THERE WERE MORE EVIDENCE
• THE VISUAL SYSTEM ASSUMES THAT LIGHT
IS COMING FROM THE TOP
LIGHT DIRECTION
SAME PICTURE UPSIDE DOWN
• The statistics-based system normally works
almost perfectly. As we could see, it sometimes
fails when input signals are highly
improbable and/or when the most probable
interpretation is not correct.
This can be seen in visual illusions.
We will look at them more closely since a recent
statistical approach explains them. This
gives us a hint about what kind of processing
is done.
WE CAN NOW ASK THE FOLLOWING QUESTIONS:
WHY IS THE PROCESSING ORGANISED IN
THIS WAY? FOR THE ANSWER WE CAN ASSUME
THAT THE PROCESSING IS OPTIMISED IN SOME
WAY.
WHAT MIGHT BE THE OPTIMISATION CRITERIA?
WHAT ARE THE GENERAL PRINCIPLES OF
HUMAN/BIOLOGICAL INFORMATION
PROCESSING?
Principles we can identify now:
• Statistical processing matched to real world
signal statistics: it provides responses to the most
probable signals. This is a very natural principle
• Minimization of information processed: as
much information as possible is eliminated, and only
the minimum information needed to provide a
response is used. This principle makes it possible to
minimize energy and processing effort.
• A book which appeared in 2005 based on
earlier research:
• The authors are visual psychologists; they
consider vision as a system interpreting the world
from images projected onto the eye:
Light from an external source
bounces off objects and is
projected. This projection is not
unique (e.g. objects of different
size will have the same projection
depending on their distance)
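The ambiguity of projection can be checked with a short calculation. This is a sketch assuming a simple pinhole-like eye, where an object of a given size at a given distance subtends the visual angle 2·atan(size / (2·distance)); the function name and the example sizes are assumptions for illustration.

```python
import math

def visual_angle_deg(size_m, distance_m):
    """Angle (degrees) subtended at the eye by an object of the given size and distance."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))

# A small object nearby and a four times larger object four times farther away
small_near = visual_angle_deg(0.5, 2.0)
large_far = visual_angle_deg(2.0, 8.0)

print(small_near, large_far)  # the two angles are identical
```

Both objects give the same projection, so the image alone cannot tell them apart.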
• In visual illusions the projection gives rise to
an improper interpretation
Natural scene,
illusion persists
Stimulus changes, illusion persists
This picture gives a strong impression of depth
because of the combination of many
mutually consistent cues:
-perspective
-texture gradient
-shading and shadow
• Geometry of natural scenes
Geometrical illusions represent a wrong
interpretation of the real world. To find out why,
researchers took pictures together with depth maps
Laser range
scanner for
measuring distance
Real pictures with corresponding distances
marked by colors
• If a large number of such pictures is taken,
a database can be created in which real world
objects are matched with distances and
statistics are calculated.
Example: subjective metrics
Let’s think about lines of different lengths which
are seen in the real world. If all lengths had the
same probability, there would be a linear relation
between the stimulation for every length. But
if this is not the case, some lengths will be stimulated
more often. This can lead to distortions in
perception.
• Example: Line length illusion
Variation of apparent length as function of orientation
In experiments people report that the apparent
length changes depending on the angle
• Why is it so? Let’s sample lines in pictures
from the database
The points in the picture were
compared with those measured by the laser
range scanner to see if they correspond to lines
in the real world. A total of 1.2x10^7 line
segments were collected
Grid of templates with straight lines
to overlay on the picture
White: accepted lines, Black: rejected lines
Probability distribution
of lines vs. length for different
orientations
Cumulative distribution
(lines shorter than x)
This shows how many lines
at certain orientation
corresponded to real lines
of length shorter or equal to
x
• Prediction of apparent length based on
probability
Take e.g. lines of length 7 at orientation 20 deg;
their cumulative probability is 0.15, which means
that 15% of lines are shorter than 7 pixels and 85% are
longer. For all orientations we get this plot
This is very similar to the one
measured in experiments with
people!!!
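The cumulative probability used here is simply the fraction of sampled lines that are no longer than a given length. A minimal sketch follows, assuming a small made-up sample of line lengths at one orientation; the numbers are invented for illustration and are not data from the study, and `cumulative_probability` is a hypothetical helper name.

```python
import numpy as np

def cumulative_probability(lengths, x):
    """Fraction of observed line lengths that are shorter than or equal to x."""
    lengths = np.asarray(lengths)
    return float(np.mean(lengths <= x))

# Hypothetical sample of 20 line lengths (pixels) at one orientation
sample = [3, 5, 7, 8, 9, 10, 11, 12, 13, 14,
          15, 16, 18, 20, 22, 25, 28, 30, 35, 40]

p = cumulative_probability(sample, 7)
print(round(p, 2))  # 0.15: 3 of the 20 sampled lines are no longer than 7 pixels
```

In this statistical account the apparent length of a line follows its rank in this distribution rather than its physical length.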
• Why do such biases exist?
In nature lines do not
appear often. Horizontal lines
are typically generated from
horizontal flat surfaces
Vertical lines are limited
by gravity and therefore rare,
and lines at 20-30 deg are even
rarer; they are mostly
projected from perspective
• Visual illusions: Angles
All angles in this picture are 90 deg
but when they are projected onto the eye,
the projections may differ by up to 60 deg
A) Bias in angle estimation between two
lines
B,C,D) Angle illusions
• To explain this a database of angles is made, as
before
Extraction of angles
Probability distributions for different
types of angles (bottom line) in natural
scenes and scenes with human-created objects
We can see a bias: angles close to 90 deg
are less likely to occur
Probability distribution
of angles is not linear,
cumulative probability
is biased
• Bias and illusions
Angles close to 90 deg are more likely
to come from a planar surface which is
typically larger than the surface for lines
intersecting at smaller angles. Thus
90 deg angles are less likely
Thus the predicted perceived
angle is different from the
actual one; for 90 deg
it is the same
The magnitude of angle
misperception (lines)
vs. experimentally
measured
values
• Explanation of angle illusions
Why does the vertical line appear tilted? We take a reference line at 60 deg (black) and check
the probability of occurrence of physical sources of a second line oriented at different
angles. Since the angle between the lines is 30 deg, we look at the probability for 30
deg and then at the cumulative probability (previous page), which gives the value 0.184;
multiplied by 180 this gives an angle of about 33 deg, in agreement with measurements
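The last step is plain arithmetic: the cumulative probability is mapped onto the 0 to 180 deg range. A small sketch, where the function name `predicted_angle_deg` is an assumption made for illustration:

```python
def predicted_angle_deg(cumulative_probability, full_range_deg=180.0):
    """Map a cumulative probability (0..1) onto the 0..180 deg angle range."""
    return cumulative_probability * full_range_deg

perceived = predicted_angle_deg(0.184)
print(round(perceived, 2))  # 33.12
```

The physical angle of 30 deg is thus predicted to be perceived as slightly larger, roughly 33 deg.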
• Size illusion
Various size illusions of
center and surrounding
According to the previous explanations
the reason for this illusion is:
The probability distributions of the possible sources
of the targets, given their different contexts,
are different
To check this hypothesis the database was
searched for circular objects, and the probabilities
of the sources of targets in the context were
calculated:
Experimental conditions
a) The inner circle is surrounded by the
4 circles with changing diameters
b) Probability of occurrence of a center
circle with specific size for outer circles
with different diameters. The dashed line
shows the probability for a circle with 14 pixels
diameter. (Bigger surrounding circles are much
less likely to appear)
c) Cumulative probability for the 14 pixel circle
d) Examples of scenes with large circles
and small circles
Why are there statistical differences? Circles originate from planar projections;
larger circles are less likely.
Why does the presence of surrounding circles change the occurrence of target central
circles differently? Larger circles arise from larger planes in the world, which
are flat areas, so it is more probable that the central circle will be larger.
In other words, the presence of larger surrounding circles increases the probability
of occurrence of physical sources of larger central circles. As a result the probability
distribution of central circles changes according to the size of the surrounding
circles.
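How the context shifts the cumulative probability of the same target can be sketched with invented numbers. The counts below are hypothetical, chosen only to mimic the described tendency (larger surrounds go with larger centers); they are not values from the database, and all names are assumptions.

```python
import numpy as np

# Hypothetical counts of center-circle diameters (pixels) found in scenes,
# split by the size of the surrounding circles. Invented for illustration.
diameters = np.array([10, 12, 14, 16, 18])
counts_small_surround = np.array([40, 30, 15, 10, 5])
counts_large_surround = np.array([10, 15, 20, 25, 30])

def cumulative_at(counts, target):
    """Cumulative probability that a center circle is no larger than `target`."""
    probs = counts / counts.sum()
    return float(probs[diameters <= target].sum())

p_small = cumulative_at(counts_small_surround, 14)
p_large = cumulative_at(counts_large_surround, 14)

print(round(p_small, 2), round(p_large, 2))  # 0.85 0.45
```

The same 14 pixel circle ranks high among centers seen with small surrounds and low among centers seen with large surrounds, which matches the direction of the illusion: it looks larger in the first context and smaller in the second.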
• Changing the interval between center and
surrounding circles
Probabilities when the
distance is changing
Dashed line is for circle
of size 14
Cumulative
Probability for
the 14 pixel circle
• Comparison of inner circle with single circle
a) Probability distribution of a
single circle vs. diameter
b) Probability for a single circle
superimposed with the probability
of a central circle surrounded
by an outer circle. The dashed line
is for the 24 pixel circle, the probability
curve is for an outer circle of 32 pixel
diameter. The cumulative probability
is much higher, so there is a bias
c) When the outer circle is much
bigger the cumulative
probability is smaller
The changing cumulative probability ratios and their dependence on the
central and outer circle sizes are clearly seen, and the illusion depends on
these parameters in exactly the same way
• Distance illusions
a) When objects are close, the perceived
distance is overestimated relative to the
physical one
b) Objects which are close to each
other are perceived as being at the
same distance
c) The distance to close objects is
overestimated, the distance to far
objects is underestimated
d) Objects on the ground at a
distance of about 7 m appear
closer, and with increasing
distance they appear more
elevated
According to the methodology, the probability
distribution of distances is measured, but there
are several variables here:
Probability of all
distances from the
scanner
Probability of the
differences in
distances between
objects for three
different horizontal
angles
Probability of
horizontal distances at
different heights
with respect to
eye level
• Interpretation of these probabilities
a) This curve for all distances has a strong peak at a distance
of 3 m. This is in agreement with experiments in which
people seeing single objects hanging in a completely dark
scene report them as being at a distance of 2-4 m
b) When the angular separation between the objects is small
they tend to be seen at equal distance, but this tendency
decreases when the angle increases
c) The dependency of the probability of distance vs. eye level
has a peak at a distance of 4 m. Thus the distance to objects
closer than 4 m will be overestimated and the distance to those
farther than 4 m will be underestimated.
This agrees with experiments
• The size illusion
Why does this happen?
Again, for explanation the database
is searched for such patterns
and probabilities are calculated.
Here we consider the case when both
figures are in line, on the left/right
The size illusion
does not depend
on the particular
type of endings
It can be induced
even without a line
and even (but less
strongly) with
dots
Templates
used
Templates
overlaid on
pictures
• Results of probability calculations
Figures are in line, extending to the left
or to the right
a) Probability of lines with specific
length and arrows pointing
inwards and outwards
b) Cumulative probabilities
c) Superimposed cumulative
probabilities showing differences
d) Example of two lines of length
50 pixels. One can see that the
cumulative probability for
outward arrows is higher, which
corresponds to the bigger length.
• Angle illusion
The line is interrupted by a vertical occluder
It is then perceived as two shifted segments
Why does this happen?
Again the statistics of such patterns are calculated
from the database of pictures
• Templates for calculation
a) Shows the templates; for each red line
there is one template corresponding to
the shift
b) The templates are matched in the pictures
and statistics can be calculated
c) Other templates can be used for different
configurations of this illusion
d) Definition of the difference in location
of the line segments
• Probability distributions measured
We can see peaks which are at nonzero shift
So the most probable interpretation from these
statistics is that there is a nonzero shift
• One can also study the effect of the angle
of the line and the width of the distractor
Change of line
orientation
Change of width
of the distractor
As can be seen, when these are larger
the peak moves towards greater
shifts, which implies that the illusion
will be stronger, and it really is so
• The processing of information in biological
systems is statistical: it aims at producing the
MOST PROBABLE response to the signals
coming from the real world. This type of
processing must be based on statistics of signals
and models from the real world. The result of
processing is the most plausible answer for
”normal conditions” and assumptions. This we
have seen in the examples before and they are
repeated next.
CONCLUSION
• Statistics-based processing seems to be very
strong in explaining visual illusions (many of
them in the same way)
The principle of statistical processing is powerful:
The system collects information about the most likely
distribution of signals and provides the most
probable interpretations for them. This will work
in most cases. Only when signals are very atypical
will it fail, but this is rare.
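Choosing the most probable interpretation can be sketched as picking the hypothesis with the largest product of prior probability and likelihood. This is one possible formalisation, not necessarily the one used in the research described; the hypothesis names and all probabilities below are invented for illustration.

```python
def most_probable(prior, likelihood):
    """Return the hypothesis with the highest posterior, and all posteriors."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    posterior = {h: p / total for h, p in unnormalised.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior

# Hypothetical prior from real world statistics: overlapping squares are common
prior = {"overlapping squares": 0.95, "cut-out shapes": 0.05}

# The image itself fits both interpretations equally well
ambiguous = {"overlapping squares": 1.0, "cut-out shapes": 1.0}
print(most_probable(prior, ambiguous)[0])  # overlapping squares

# Strong extra evidence (e.g. the shapes are moved apart) can overturn the prior
extra_evidence = {"overlapping squares": 0.01, "cut-out shapes": 1.0}
print(most_probable(prior, extra_evidence)[0])  # cut-out shapes
```

The generic interpretation wins when the image is ambiguous, and it fails only when the true source of the signal is an improbable one.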
BUT….
• We have to remember that biological systems
are able to deal with extreme variations of
signals and still extract right information from
them. This will be illustrated now by the
example of face recognition
Faces can be distorted in many ways and still
recognized. We can guess something about
PRINCIPLES OF FACE PROCESSING
We can recognize FAMILIAR faces from extremely low resolution
pictures.
How is this done? We do not have a clear idea, but it points
to the minimization of processed information
Contour information is not enough
A face is processed somehow as a ”whole” and not as composed
of parts. In the combined picture on the left we see a new
face; when we split it we recognize other faces
Eyebrows are very important for the
identification of faces
Faces can be recognized despite extreme distortions
Faces seem to be encoded in memory in an exaggerated,
caricature-like way:
A) Average face (averaged from a number of persons)
B) Some typical face
C) Face created by taking a big deviation from the average
Such faces are recognized even better than typical ones
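Exaggerating the deviation from the average can be written as a one-line formula: caricature = average + k · (face - average), with k > 1. The sketch below assumes faces are represented as plain feature vectors; the three-number vectors and the name `caricature` are made up for the example.

```python
import numpy as np

def caricature(face, average_face, exaggeration=1.5):
    """Push a face away from the average by scaling its deviation from it."""
    face = np.asarray(face, dtype=float)
    average_face = np.asarray(average_face, dtype=float)
    return average_face + exaggeration * (face - average_face)

# Hypothetical feature vectors (e.g. normalised distances between facial landmarks)
average = np.array([0.0, 0.0, 0.0])
typical = np.array([0.2, -0.1, 0.4])

print(caricature(typical, average, exaggeration=2.0))  # twice the deviation from the average
```

With exaggeration equal to 1 the face is unchanged; larger values give case C) above.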
Newborn babies pay more attention to face-like objects
(upper row) than to non-face-like ones
Faces and antifaces: If the face within the green circle is observed for some time,
the center one will not be recognized correctly but as the one in the red circle
(more distance from the center means more differences)
This means that there is some kind of prototype encoding and tuning to
it
Impact of skin pigmentation
Row 1: Faces differ only in shape
Row 2: Faces differ only in skin pigmentation but not shape
Row 3: Faces differ in shape and pigmentation
We see that pigmentation has significant impact (row 2)
Color helps: Left original
Middle black and white
Right color only, eyes can be located more precisely
From a negative picture it is impossible to identify
faces
Face recognition strongly compensates for the direction of
illumination; the pictures above are easily recognized as the same person
Response of a neural cell of a monkey in the face processing
area of the brain. The response to something like a face is much
stronger than to a hand. (But remember that millions
and millions of cells are processing at the same time)
Measurement from the human brain: the signal from face-like pictures
is much stronger than from other objects
The examples shown for faces indicate how sophisticated
information processing in biological systems is.
What is most amazing is getting correct results despite
extreme distortions. For the most part, we do not know
how this is done and we have difficulty in thinking how
to develop algorithms which would have similar
capabilities. This is a topic for studies in the future