The world of sounds
What do we hear for?
Seeing is knowing what is where by looking
(David Marr)
Seeing is predicting what is where, verified
by looking, in order to drink that cup of
coffee
(Reza Shadmehr)
Hearing is predicting what will happen next,
verified by listening, in order to know as much as
possible about what’s out there
(Eli Nelken)
Even simple sounds tell stories
A stupid story
The calm of the sea
Vox balaenae (Voice of the whale)
For flute, cello and piano (cello and piano playing)
George Crumb
A shout of despair
Wozzeck, orchestral transition between scenes 2 and 3 of act 3
Alban Berg
Auditory worlds
• What are sounds?
• What do we hear?
• How do we hear?
Sound As a Pressure Wave
Vibrations of objects set up pressure waves in the surrounding
air.
The “elastic” property of air allows these pressure waves to
propagate (spread).
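As a rough illustration of the pressure-wave picture, the sketch below samples the pressure deviation of a pure tone at one point in space and computes the spacing between successive pressure peaks. The speed of sound (343 m/s), the 440 Hz tone, the amplitude, and the function names are assumptions chosen for the example; none of them come from the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)

def pressure_wave(freq_hz, amplitude_pa, duration_s, sample_rate=44100):
    """Sample the pressure deviation (in Pa) of a pure tone at a fixed point."""
    n = round(duration_s * sample_rate)
    return [amplitude_pa * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

def wavelength_m(freq_hz, speed=SPEED_OF_SOUND):
    """Distance between successive pressure peaks of the propagating wave."""
    return speed / freq_hz

# Hypothetical example: 10 ms of a 440 Hz tone with a 0.02 Pa peak deviation.
wave = pressure_wave(440.0, 0.02, 0.01)
print(len(wave), "samples; wavelength", round(wavelength_m(440.0), 3), "m")
```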
Structure of sounds
What happens without structure?
Introducing structure
The bird and Chopin
© Gabriel J. Arsante
What are sounds?
• Structure at many time scales
• Perceptual correlates:
– Melodies (1 s)
– Notes (0.1 s)
– Pitch (much faster than 0.01 s)
Peripheral processing of sounds
[Figure: the ear, with the outer ear, middle ear, and inner ear labelled]
Cross Section of Cochlea
“Travelling Wave” Along the Basilar Membrane
Von Békésy
Travelling Wave Peaks at Different Locations As the Frequency Changes
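The place-frequency relationship can be sketched numerically. The example below uses the Greenwood function, a standard empirical fit that is not mentioned on the slides; the human constants (A = 165.4, a = 2.1, k = 0.88) and the function name are assumptions brought in for illustration only.

```python
def greenwood_frequency_hz(x, A=165.4, a=2.1, k=0.88):
    """Approximate frequency (Hz) at which the travelling wave peaks.

    x is the relative position along the basilar membrane:
    0.0 is the apex and 1.0 is the base. The constants are the
    commonly quoted human values (an assumption, not from the slides).
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 (apex) and 1 (base)")
    return A * (10 ** (a * x) - k)

# Low frequencies peak near the apex, high frequencies near the base.
positions = [0.0, 0.25, 0.5, 0.75, 1.0]
place_map = [(x, greenwood_frequency_hz(x)) for x in positions]
for x, f in place_map:
    print(f"position {x:.2f} -> about {f:.0f} Hz")
```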
Inner Hair Cells
Outer Hair Cells
A simple neuron in the auditory system
[Figure label: BF (best frequency)]
The auditory
pathways
Responses of simple neurons
to complex sounds
A set of complex sounds
Orig
Slow
In consequence…
The neurogram
We get a very rich and precise representation of the incoming sound at the level of the auditory nerve.
The sound and its components
[Audio examples labelled: full, 337, 600, 2000]
Brahms, Geistliches Wiegenlied, Op. 91 no. 2
Kathleen Ferrier, Phyllis Spurr, Max Gilbert
Is that enough?
(do we hear the spectrogram?)
What are the perceptual qualities of
sounds?
“The basic elements of any sound are
loudness, pitch, contour, duration (or
rhythm), tempo, timbre, spatial location,
and reverberation.”
(D.J. Levitin, This is Your Brain on Music: The Science of a Human
Obsession, p.14)
The Long Road from Spectrogram
to Perception
• How do we go from the ‘neurogram’ to
‘loudness, pitch, contour, duration (or
rhythm), tempo, timbre, spatial location,
and reverberation’?
Relationships with low-level
features…
• Loudness with sound intensity
– Encoded by some population-averaged activity
• Pitch with periodicity
Pitch: examples
[Figure: example waveforms over time for stimuli that evoke pitch: pure tones, filtered clicks, SAM tones, and iterated ripple noise (IRN); one panel is labelled 3 kHz]
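One way to see why pitch goes with periodicity is a "missing fundamental" stimulus. The sketch below builds a complex of harmonics 3, 4, and 5 of 200 Hz: it has no energy at 200 Hz, yet its waveform repeats 200 times per second. The stimulus parameters, the autocorrelation-based estimator, and all function names are assumptions made for this example, not the method used in the talk.

```python
import math

def harmonic_complex(f0, harmonics, duration_s=0.1, sample_rate=8000):
    """Sum of equal-amplitude sinusoids at the given harmonics of f0."""
    n = round(duration_s * sample_rate)
    return [sum(math.sin(2 * math.pi * f0 * h * i / sample_rate) for h in harmonics)
            for i in range(n)]

def amplitude_at(signal, freq_hz, sample_rate=8000):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

def periodicity_hz(signal, sample_rate=8000, fmin=50, fmax=500):
    """Repetition rate taken from the lag with the largest autocorrelation."""
    n = len(signal)
    lags = range(sample_rate // fmax, sample_rate // fmin + 1)
    best_lag = max(lags, key=lambda lag: sum(signal[i] * signal[i + lag]
                                             for i in range(n - lag)))
    return sample_rate / best_lag

tone = harmonic_complex(200.0, [3, 4, 5])   # components at 600, 800, 1000 Hz
print("amplitude at 200 Hz:", round(amplitude_at(tone, 200.0), 6))
print("amplitude at 600 Hz:", round(amplitude_at(tone, 600.0), 6))
print("periodicity:", periodicity_hz(tone), "Hz")
```

In this sketch the spectrum contains only 600, 800, and 1000 Hz, while the autocorrelation peaks at a lag of 5 ms, which corresponds to the 200 Hz repetition rate.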
Relationships with low-level
features…
• Loudness with sound intensity
– Encoded by some population-averaged activity
• Pitch with periodicity
– Periodicity IS NOT frequency!
• Contour with slow amplitude modulations
– Encoded in the range of 1-10 Hz very clearly at the level of A1
(e.g. Shamma and collaborators)
– But not slower than that (probably)
• Duration/rhythm with ???
• Tempo with ???
• Timbre with spatial activation patterns (e.g. in A1)
• Spatial location with ITD/ILD/spectral activation patterns
– Low-level information available at the CN/SOC
– But requires integration
• Reverberation with ???????
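To make the ITD (interaural time difference) cue from the list above concrete, the sketch below delays a noise burst at one ear and recovers the delay as the lag that maximizes the cross-correlation between the two ears. The sample rate, the 12-sample delay, the noise source, and the function names are assumptions for illustration; this is a textbook-style estimator, not a model of the CN/SOC circuitry.

```python
import random

def delay(signal, n_samples):
    """Delay a signal by n_samples, zero-padding the start and keeping its length."""
    return [0.0] * n_samples + signal[:len(signal) - n_samples]

def estimate_itd_s(left, right, sample_rate, max_lag):
    """Estimate the ITD in seconds; positive means the left ear leads."""
    def xcorr(lag):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        return sum(left[i] * right[i + lag]
                   for i in range(len(left)) if 0 <= i + lag < len(right))
    best_lag = max(range(-max_lag, max_lag + 1), key=xcorr)
    return best_lag / sample_rate

SAMPLE_RATE = 48000                         # Hz (assumed)
rng = random.Random(0)                      # fixed seed for repeatability
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]

left = source                               # sound reaches the left ear first
right = delay(source, 12)                   # right ear lags by 12 samples (250 µs)

itd = estimate_itd_s(left, right, SAMPLE_RATE, max_lag=40)
print(f"estimated ITD: {itd * 1e6:.0f} µs")
```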
The Long Road from Spectrogram
to Perception
• Pitch, timbre, phonemic identity, and so on are
‘separable’ – they are independent of each other
• They represent high-level generalizations
– Many different sounds have the same pitch (violin and
trumpet), same timbre (trumpet on two different
tones), same phonemic identity (two different people
talking)
– The neurograms of these pairs of sounds are very
different from each other
• The generalizations should be derivable from
the neurogram, but are not explicitly represented
at that level
The Long Road from Spectrogram
to Perception
Problem no. 1: we do not hear the physics of
sounds, but rather their derived properties
(Reverse hierarchies – we perceive high
representation levels unless we make
serious efforts to go down into the details)
The Long Road from Spectrogram
to Perception
Problem no. 2: In natural conditions, sounds
rarely occur by themselves
We have to group and segregate ‘bits of
sounds’ in order to form representations of
‘auditory objects’
What comes first, the sound or its
properties?
• We may need to start by forming objects
(solve problem no. 2) and only later assign
properties to them (solve problem no. 1)
Hypothesis: the early auditory
system (presumably up to the
level of primary auditory cortex)
deals with the formation of
auditory objects
Evidence A:
Object representation in primary
auditory cortex
The auditory
pathways
Primary auditory cortex is a higher brain area!
Visual system: photoreceptors, bipolar cells, retinal ganglion cells, LGN, V1, IT (face cells)
Auditory system: hair cells, auditory nerve fibers, cochlear nucleus, superior olive, inferior colliculus, MGB, auditory cortex
Other labels on the slide: frequency detection; localization and binaural; species-specific calls?; auditory scene analysis?
The auditory
pathways
A1 Neurons have a large variety of
frequency response areas (FRAs)
Memory in primary auditory cortex
Neurons in auditory cortex
represent the weak components
of sounds
(evidence for the representation of
auditory objects in primary
auditory cortex)
Strong effects of weak
backgrounds…
[Figure: response plots with axes labelled kHz, dB Attn, and ms]
Some cortical neurons respond to weak
noise in mixture with high-level tones
Tones in modulated and
unmodulated background
Weak tones in strong noise
Noise (bandwidth: BF, 10 Hz trapezoidal envelope)
Tone (BF)
Tone+Noise
Las et al. 2005
Responses to high-level tones in
silence and to low-level tones in
noise are similar
Evidence B:
coding of surprising events in
primary auditory cortex
[Figure: tone sequences over time built from a low-frequency and a high-frequency tone, presented with probabilities of 95%, 50%, and 5%; the common tone is labelled Standard and the rare tone Deviant. Values shown in the panels: 0.34, 0.32; SSA = 0.23]
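The oddball design on this slide (a common standard tone and a rare deviant tone) can be sketched as follows. The sequence generator uses the 95%/5% split shown on the slide; the contrast index (d - s) / (d + s) is one commonly used way to summarize stimulus-specific adaptation and is an assumption here, since the slides do not give the formula behind the SSA value. The response numbers in the example are hypothetical.

```python
import random

def oddball_sequence(n_tones, p_deviant=0.05, seed=0):
    """Random sequence in which the deviant tone occurs with probability p_deviant."""
    rng = random.Random(seed)
    return ["deviant" if rng.random() < p_deviant else "standard"
            for _ in range(n_tones)]

def contrast_index(resp_deviant, resp_standard):
    """Normalized difference between the responses to the same tone
    when it is rare (deviant) and when it is common (standard).
    0 means no preference; positive values mean a larger deviant response."""
    return (resp_deviant - resp_standard) / (resp_deviant + resp_standard)

sequence = oddball_sequence(1000)
deviant_fraction = sequence.count("deviant") / len(sequence)
print("fraction of deviants:", deviant_fraction)

# Hypothetical mean spike counts, chosen only to show the arithmetic.
example_index = contrast_index(resp_deviant=30.0, resp_standard=10.0)
print("contrast index:", example_index)
```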
…Also with
spikes…
Evidence C:
Perceptual qualities such as pitch
are coded outside primary
auditory cortex
Activation of auditory cortex by
noise and pitched stimuli
Activation by intelligible speech
Take-home messages
• Auditory perception is far removed from
the ‘physical’, low-level representation of
sounds
• A major problem of early processing is the
definition of the ‘objects’ to which
properties will be assigned
• There is evidence that objects are defined first, and that properties are assigned later, in higher brain areas
Reverse Hierarchy Theory
• The hierarchical trade-offs that dictate the relations between processing and perception
• We perceive the high-order constructs
rather than the low-level physics
Interactions between high- and low-level representations
From Hochstein and
Ahissar 2002
Change blindness
Name the color of the letters
נשר (eagle)
אדום (red)
כחול (blue)
Visual Reverse Hierarchy Theory
(RHT)
(Ahissar & Hochstein, 1997; Hochstein & Ahissar, 2002)
Phonological/semantic level
day bay
dream
……
Low levels are sensitive to fine temporal cues, with μs resolution
Initial perception is based on high levels, which represent phonological entities
See: Nahum, Nelken and
Ahissar, PLoS 2008
We can either hear the sounds or
understand the words, but not
both at the same time