Hearing Complex Sounds

MSc Neuroscience
Jan Schnupp
Sound Signals
Many physical objects emit sounds when they are
“excited” (e.g. hit or rubbed).
Sounds are just pressure waves rippling through the
air, but they carry a lot of information about the
objects that emitted them.
(Example: what are these two objects? Which one is heavier, object A or object B?)
The sound (or signal) emitted by an object (or system)
when hit is known as the impulse response.
Impulse responses of everyday objects can be quite
complex, but the sine wave is a fundamental ingredient
of these (or any) complex sounds (or signals).
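Vibrations of a Spring-Mass System
Undamped
1. $F = -k \cdot y$ (Hooke’s Law)
2. $F = m \cdot a$ (Newton’s 2nd)
3. $a = dv/dt = d^2y/dt^2$
$\Rightarrow -k \cdot y = m \cdot d^2y/dt^2$
$\Rightarrow y(t) = y_0 \cdot \cos\left(t \cdot \sqrt{k/m}\right)$
Damped
4. $-k \cdot y - r \cdot dy/dt = m \cdot d^2y/dt^2$
$\Rightarrow y(t) = y_0 \cdot e^{-r t/(2m)} \cdot \cos\left(t \cdot \sqrt{k/m - (r/(2m))^2}\right)$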
Don’t worry about the formulae! Just
remember that mass-spring systems
like to vibrate at a rate proportional to
the square root of their “stiffness” and
inversely proportional to the square root of their mass.
http://auditoryneuroscience.com/acoustics/simple_harmonic_motion
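The square-root rule can be checked with a few lines of code. This is a minimal sketch, assuming SI units and made-up values for stiffness k, mass m and damping r; the function names are illustrative and not from the slides.

```python
import math

def natural_frequency_hz(k, m):
    """Undamped vibration rate (Hz) of mass m (kg) on a spring of stiffness k (N/m)."""
    return math.sqrt(k / m) / (2 * math.pi)

def damped_frequency_hz(k, m, r):
    """Vibration rate (Hz) of the damped system; r is the damping coefficient (kg/s)."""
    return math.sqrt(k / m - (r / (2 * m)) ** 2) / (2 * math.pi)

def displacement(t, y0, k, m, r=0.0):
    """Displacement y(t): a cosine at the damped rate under a decaying exponential envelope."""
    omega = 2 * math.pi * damped_frequency_hz(k, m, r)
    return y0 * math.exp(-r * t / (2 * m)) * math.cos(omega * t)

# Assumed example values: k = 100 N/m, m = 1 kg.
base = natural_frequency_hz(100.0, 1.0)
stiffer = natural_frequency_hz(400.0, 1.0)   # 4x the stiffness
heavier = natural_frequency_hz(100.0, 4.0)   # 4x the mass
print(base, stiffer / base, heavier / base)
```

Quadrupling the stiffness doubles the vibration rate, and quadrupling the mass halves it.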
Resonant Cavities
In resonant cavities, “lumps of air” at the entrance/exit of the
cavity oscillate under the elastic forces exerted by the air inside
the cavity.
The preferred resonance frequency is inversely proportional to the
square root of the volume. (Large resonators => deeper sounds).
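The volume dependence can be illustrated numerically. This is a minimal sketch, assuming the textbook Helmholtz resonator formula f = (c / 2π) · sqrt(A / (V · L)), which is not given on the slide; the bottle dimensions are made-up illustration values.

```python
import math

def helmholtz_frequency_hz(volume_m3, neck_area_m2, neck_length_m, c=343.0):
    """Resonance frequency of an idealized Helmholtz resonator.

    c is the speed of sound in air (m/s); end corrections to the
    neck length are ignored in this sketch.
    """
    return c / (2 * math.pi) * math.sqrt(neck_area_m2 / (volume_m3 * neck_length_m))

# Assumed example: a 1 litre bottle versus a 4 litre bottle with the same neck.
small = helmholtz_frequency_hz(1e-3, 3e-4, 0.05)
large = helmholtz_frequency_hz(4e-3, 3e-4, 0.05)
print(small, large, large / small)
```

With four times the volume, the resonance frequency is halved, so the larger cavity sounds deeper.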
The Ear
Organ of Corti
Cochlea “unrolled” and sectioned
Modes of Vibration and Harmonic Complex Tones
An “ideal” string would maintain the
triangular shape set up when the
string is plucked throughout the
oscillation.
Fourier analysis approximates
such vibrations as a sum (series) of
sine waves. For large numbers of
sine waves (“Fourier components”)
the approximation becomes very
good, for an infinite number of
components it is exact.
Natural sounds are often made up of
sine components that are
harmonically related i.e. their
frequencies are all integer multiples
of a “fundamental frequency”.
(Figure: the plucked-string waveform y built up step by step from odd harmonics sin(x), sin(3x), sin(5x), sin(7x), sin(9x), sin(11x), shown with amplitudes 1, 1/4, 1/8, 1/16, 1/32, 1/64.)
http://auditoryneuroscience.com/?q=acoustics/modes_of_vibration
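The claim that the approximation improves as more Fourier components are added can be checked numerically. This is a minimal sketch, assuming the standard textbook series for a unit triangle wave (odd harmonics with amplitudes 1/n² and alternating signs), which may differ from the coefficients shown on the slide; the function names are illustrative.

```python
import math

def triangle_exact(x):
    """Unit-amplitude triangle wave with period 2*pi."""
    return (2 / math.pi) * math.asin(math.sin(x))

def triangle_partial_sum(x, n_terms):
    """Sum of the first n_terms odd-harmonic sine components."""
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1                      # harmonic numbers 1, 3, 5, ...
        total += (-1) ** k * math.sin(n * x) / n ** 2
    return 8 / math.pi ** 2 * total

def max_error(n_terms, n_points=1000):
    """Largest deviation from the exact triangle over one period."""
    xs = [2 * math.pi * i / n_points for i in range(n_points)]
    return max(abs(triangle_partial_sum(x, n_terms) - triangle_exact(x)) for x in xs)

for n_terms in (1, 5, 50):
    print(n_terms, max_error(n_terms))
```

The error shrinks steadily as components are added, and is largest at the sharp corners of the triangle.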
Click Trains & the “30 Hz Transition”
(Figure: click train, sound pressure against time.)
At click rates up to ca. 30 Hz, each click in a click train is
perceived as an isolated event.
At click rates above ca. 30 Hz, individual clicks fuse, and
one perceives a continuous hum with a strong pitch.
https://mustelid.physiol.ox.ac.uk/drupal/?q=pitch/click_train
The Impulse (or “Click”)
The “ideal click”, or impulse, is an infinitesimally short signal.
The Fourier Transform encourages us to think of this click as an
infinite series of sine waves, which have started at the beginning
of time, continue until the end of time, and all just happen to pile
up at the one moment when the click occurs.
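A finite, sampled version of this idea can be tried out directly. This is a minimal sketch, assuming a 64-sample signal and a plain discrete Fourier transform as a stand-in for the continuous transform; the click position is arbitrary.

```python
import cmath
import math

N = 64          # number of samples (assumed)
CLICK_AT = 10   # sample at which the click occurs (assumed)

click = [0.0] * N
click[CLICK_AT] = 1.0

# Discrete Fourier transform of the click: one complex amplitude per frequency.
spectrum = [sum(click[t] * cmath.exp(-2j * cmath.pi * k * t / N) for t in range(N))
            for k in range(N)]
magnitudes = [abs(c) for c in spectrum]

# Rebuild the signal by adding equal-amplitude cosines of every frequency,
# each phase-aligned to peak at the click time.
rebuilt = [sum(math.cos(2 * math.pi * k * (t - CLICK_AT) / N) for k in range(N)) / N
           for t in range(N)]

print(min(magnitudes), max(magnitudes))
print(rebuilt[CLICK_AT], max(abs(v) for t, v in enumerate(rebuilt) if t != CLICK_AT))
```

Every frequency is present with the same amplitude, and the cosines cancel everywhere except at the one sample where they all peak together.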
Basilar Membrane Response to Clicks
http://auditoryneuroscience.com/ear/bm_motion_2
Why Click Trains Have Pitch
If we represent each click in a train by its Fourier
Transform, then it becomes clear that certain sine
components will add (top) while others will cancel
(bottom). This results in a strong harmonic structure.
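The resulting harmonic structure shows up in a small numerical example. This is a minimal sketch, assuming a 6400 Hz sample rate, a 40 ms signal and a 400 Hz click rate (all illustration values), using a plain discrete Fourier transform.

```python
import cmath

SAMPLE_RATE = 6400    # Hz (assumed)
N_SAMPLES = 256       # 40 ms of signal
CLICK_PERIOD = 16     # samples between clicks, i.e. 400 clicks per second

click_train = [1.0 if t % CLICK_PERIOD == 0 else 0.0 for t in range(N_SAMPLES)]

def dft_magnitudes(signal):
    """Magnitude of each frequency component from 0 Hz up to half the sample rate."""
    n = len(signal)
    return [abs(sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, s in enumerate(signal)))
            for k in range(n // 2)]

magnitudes = dft_magnitudes(click_train)

# Frequencies at which the sine components add up rather than cancel.
component_freqs = [k * SAMPLE_RATE / N_SAMPLES
                   for k, mag in enumerate(magnitudes) if mag > 1e-6]
print(component_freqs)
```

Only the constant (0 Hz) term and integer multiples of the click rate survive; all other components cancel.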
Basilar Membrane Response to Click Trains
auditoryneuroscience.com | The Ear
AN Phase Locking to Artificial “Single Formant” Vowel Sounds
(Figure: Cariani & Delgutte AN recordings, showing phase locking to the modulator (envelope) and phase locking to the carrier.)
https://mustelid.physiol.ox.ac.uk/drupal/?q=ear/bm_motion_3
Vocal Folds in Action
auditoryneuroscience.com | Vocalizations
Articulation
Articulators (lips, tongue, jaw, soft palate) move to change
resonance properties of the vocal tract.
auditoryneuroscience.com | Vocalizations
Harmonics & formants of a vowel
(Figure labels: formants, harmonics.)
Formants Arise From Resonances in the Vocal Tract
Speakers change formant frequencies by making
resonance cavities in their mouth, nose and throat
larger or smaller.
The “Neurogram”
As a crude approximation, one might say that it is the job of the ear to produce a spectrogram of the incoming sounds, and that the brain interprets the spectrogram to identify sounds.
This figure shows histograms of auditory nerve fibre discharges in response to a speech stimulus. Discharge rates depend on the amount of sound energy near the neuron’s characteristic frequency.
Phase Locking
The discharges of cochlear nerve fibres to low-frequency sounds are not random;
they occur at particular times (phase locking).
https://mustelid.physiol.ox.ac.uk/drupal/?q=ear/phase_locking
Evans (1975)
Quantifying Phase-Locking
Phase locking is usually measured as a “vector strength”, also
known as “synchronicity index” (SI).
To calculate this, a period histogram of the neural response is
normalized, and each bin of the histogram is thought of as a
vector whose length is the bin height and whose direction is the stimulus phase of that bin.
The SI is given by the length of the vector sum over all bins.
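The calculation can be written out in a few lines. This is a minimal sketch, assuming the period histogram is given as a list of spike counts per phase bin and that each bin is represented by the phase at its centre; the function name and the example histograms are made up for illustration.

```python
import math

def synchronization_index(period_histogram):
    """Length of the vector sum of a normalized period histogram.

    Each bin contributes a vector whose length is its share of all
    spikes and whose angle is the stimulus phase at the bin centre.
    """
    total = sum(period_histogram)
    n_bins = len(period_histogram)
    x = y = 0.0
    for i, count in enumerate(period_histogram):
        angle = 2 * math.pi * (i + 0.5) / n_bins
        x += (count / total) * math.cos(angle)
        y += (count / total) * math.sin(angle)
    return math.hypot(x, y)

perfect = synchronization_index([0, 0, 10, 0, 0, 0, 0, 0])  # all spikes at one phase
partial = synchronization_index([1, 2, 4, 2, 1, 0, 0, 0])   # spikes clustered around one phase
none = synchronization_index([5, 5, 5, 5, 5, 5, 5, 5])      # spikes at all phases equally
print(perfect, partial, none)
```

The index runs from 0 (no phase preference) to 1 (all spikes at the same phase of the stimulus cycle).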
Phase Locking Limits
(Figure panels: hair cell receptor potentials; AN fibre responses. Labels: 1000 Hz, 2000 Hz, 3000 Hz, 4000 Hz. Axes: synchrony index against frequency (kHz).)
AN fibres are unable to phase lock to signal fluctuations at
rates faster than 3 kHz.
This is due to low-pass filtering in the hair cell receptors.
Frequency Coding
The Place Theory stipulates that frequencies are encoded
by activity across the tonotopic array of fibers in the AN, as
well as in tonotopic nuclei along the lemniscal auditory
pathway.
The Timing Theory posits that temporal information
conveyed through phase locking provides the dominant cue
to frequency information.
Neither the Place nor the Timing Theory can account for all
psychophysical data.
Missing Fundamental Sounds
http://auditoryneuroscience.com/topics/missing-fundamental
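A missing fundamental sound is easy to construct. This is a minimal sketch, assuming a 200 Hz fundamental, harmonics 2 to 5, and an 8000 Hz sample rate (all illustration values); the circular autocorrelation is used here only as a simple stand-in for a periodicity measure.

```python
import cmath
import math

SAMPLE_RATE = 8000        # Hz (assumed)
N_SAMPLES = 400           # 50 ms of signal
F0 = 200.0                # fundamental frequency, absent from the signal
HARMONICS = [2, 3, 4, 5]  # components at 400, 600, 800 and 1000 Hz

signal = [sum(math.sin(2 * math.pi * h * F0 * t / SAMPLE_RATE) for h in HARMONICS)
          for t in range(N_SAMPLES)]

def amplitude_at(sig, freq_hz):
    """Amplitude of the sine component of sig at freq_hz."""
    coeff = sum(s * cmath.exp(-2j * math.pi * freq_hz * t / SAMPLE_RATE)
                for t, s in enumerate(sig))
    return 2 * abs(coeff) / len(sig)

def circular_autocorrelation(sig, lag):
    """Similarity of the signal to a copy of itself shifted by lag samples."""
    n = len(sig)
    return sum(sig[t] * sig[(t + lag) % n] for t in range(n))

best_lag = max(range(1, 60), key=lambda lag: circular_autocorrelation(signal, lag))
period_ms = 1000 * best_lag / SAMPLE_RATE

print(amplitude_at(signal, F0), amplitude_at(signal, 2 * F0), period_ms)
```

There is no energy at 200 Hz, yet the waveform repeats every 5 ms, which is the period of the absent 200 Hz fundamental.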
The Pitch of “Complex” Sounds (Examples)
(Figure: spectrograms (frequency, 0 to 3500 Hz, against time) of four example sounds: pure tone, AM tones, click trains, and iterated rippled (comb filtered) noise.)
The Periodicity of a Signal is the Major Determinant of its Pitch
Iterated rippled noise can be made more or less periodic by
increasing or decreasing the number of iterations. The less
periodic the signal, the weaker the pitch.
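The effect of the number of iterations can be illustrated with a delay-and-add procedure. This is a minimal sketch, assuming one common way of generating iterated rippled noise (repeatedly adding a delayed copy of the signal to itself), with a circular delay for simplicity; the signal length, delay and seed are illustration values.

```python
import random

def iterated_rippled_noise(n_samples, delay, n_iterations, seed=0):
    """Gaussian noise passed n_iterations times through a delay-and-add step."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(n_iterations):
        # Add a copy of the signal delayed by `delay` samples (wrapping around).
        signal = [signal[t] + signal[(t - delay) % n_samples]
                  for t in range(n_samples)]
    return signal

def normalized_autocorrelation(signal, lag):
    """Circular autocorrelation at `lag`, scaled so that lag 0 gives 1."""
    n = len(signal)
    energy = sum(s * s for s in signal)
    return sum(signal[t] * signal[(t + lag) % n] for t in range(n)) / energy

DELAY = 40  # samples (assumed); sets the period of the pitch
periodicity = {
    n: normalized_autocorrelation(iterated_rippled_noise(4000, DELAY, n), DELAY)
    for n in (0, 1, 8)
}
print(periodicity)
```

Plain noise shows almost no correlation at the delay, and the correlation grows toward 1 as iterations are added, i.e. the signal becomes more periodic.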
Sinusoidally Amplitude Modulated (SAM) Tones
Example with carrier frequency c: 5000 Hz, modulator frequency m: 400 Hz
Make a sinusoidally modulated tone by multiplying carrier with
modulator:
$\sin(2\pi c t) \cdot (0.5 \cdot \sin(2\pi m t) + 0.5)$
or, equivalently, by adding sinusoidal components at frequencies c, c+m and c-m:
$0.5 \cdot \sin(2\pi c t) - 0.25 \cdot \cos(2\pi (c+m) t) + 0.25 \cdot \cos(2\pi (c-m) t)$
(!) The SAM tone has no energy at the modulation frequency.
Nevertheless the modulator influences the perceived pitch.
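Both constructions, and the absence of energy at the modulation frequency, can be checked numerically. This is a minimal sketch, assuming a 40 kHz sample rate and a 10 ms signal; the side bands are written as cosines with the signs that make the sum match the product exactly, which is an assumption about the intended identity.

```python
import cmath
import math

SAMPLE_RATE = 40000   # Hz (assumed)
N_SAMPLES = 400       # 10 ms of signal
C = 5000.0            # carrier frequency (Hz)
M = 400.0             # modulator frequency (Hz)

def product_form(t):
    """Carrier multiplied by a raised sinusoidal modulator."""
    return math.sin(2 * math.pi * C * t) * (0.5 * math.sin(2 * math.pi * M * t) + 0.5)

def sum_form(t):
    """The same signal written as carrier plus two side bands."""
    return (0.5 * math.sin(2 * math.pi * C * t)
            - 0.25 * math.cos(2 * math.pi * (C + M) * t)
            + 0.25 * math.cos(2 * math.pi * (C - M) * t))

times = [i / SAMPLE_RATE for i in range(N_SAMPLES)]
sam = [product_form(t) for t in times]
max_difference = max(abs(product_form(t) - sum_form(t)) for t in times)

def amplitude_at(signal, freq_hz):
    """Amplitude of the sinusoidal component of signal at freq_hz."""
    coeff = sum(s * cmath.exp(-2j * math.pi * freq_hz * i / SAMPLE_RATE)
                for i, s in enumerate(signal))
    return 2 * abs(coeff) / len(signal)

print(max_difference)
print(amplitude_at(sam, M), amplitude_at(sam, C - M), amplitude_at(sam, C), amplitude_at(sam, C + M))
```

The spectrum contains only the carrier (amplitude 0.5) and the two side bands (amplitude 0.25 each); the component at 400 Hz is zero.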
AN Phase Locking to SAM Tones
(Figure: AN fibre responses to SAM tones, arranged by spontaneous rate (high, medium, low) and sound level (high, low).)
Medium SR fibers phase lock better to amplitude modulation than high or
low SR fibers.
The ability to phase lock to amplitude modulation declines at high
sound levels.
Cell Types of the Cochlear Nucleus
(Figure labels: subdivisions AVCN, PVCN and DCN; cell types spherical bushy, globular bushy, stellate, octopus and fusiform.)
CN Phase Locking to Amplitude Modulations
Certain cell types in the CN (including Onset and some Chopper types) exhibit much stronger phase locking to AM than is seen in their AN inputs.
Encoding of Envelope Modulations in the Midbrain
Neurons in the midbrain or above show much less phase locking to AM than neurons in the brainstem.
Transition from a timing to a rate code.
Some neurons have bandpass modulation transfer functions (MTFs) and exhibit “best modulation frequencies” (BMFs).
Topographic maps of BMF may exist within isofrequency laminae of the ICc (“periodotopy”).
Periodotopic maps via fMRI
Baumann et al. (Nat Neurosci 2011) described periodotopic maps in monkey IC obtained with fMRI.
They used stimuli from 0.5 Hz (infra-pitch) to 512 Hz (mid-range pitch).
Their sample size is quite small (3 animals; false positive?).
The observed orientation of their periodotopic map (medio-dorsal to latero-ventral for high to low) appears to differ from that described by Schreiner & Langner (1988) in the cat (predominantly caudal to rostral).
Proposed Periodotopy in Gerbil A1
SAM tones should only activate the high frequency parts of the tonotopically organized A1.
However, activity (presumed to correspond to the low pitch of these signals) is also seen in the low frequency parts of A1.
This activity is thought to be organized in a concentric periodotopic map.
However... Periodotopy inconsistent in ferret cortex
(Figure: maps for SAM tones, hp clicks and hp IRN in animal 1 and animal 2.)
Nelken, Bizley, Nodal, Ahmed, Schnupp, King (2008) J. Neurophysiol 99(4)
A pitch area in primates?
In marmoset, pitch sensitive neurons are most commonly found on the boundary between fields A1 and R.
Fig 2 of Bendor & Wang, Nature 2005
A pitch sensitive neuron in marmoset A1?
Apparently pitch sensitive neurons in marmoset A1.
Fig 1 of Bendor & Wang, Nature 2005
Artificial Vowel Sounds
(Figure: spectra (dB against frequency, 0 to 10000 Hz) of artificial vowels /a/, /e/, /u/ and /i/, with panels labelled 200 Hz, 336 Hz, 565 Hz and 951 Hz.)
Responses to Artificial Vowels
Bizley et al J Neurosci 2009
Joint Sensitivity to Formants and Pitch
(Figure axes: pitch (Hz) and vowel type (timbre).)
Bizley, Walker, Silverman, King & Schnupp - J Neurosci 2009
Mapping cortical sensitivity to sound features
(Figure labels: timbre; neural sensitivity.)
Nelken et al., J Neurophys, 2004
Bizley et al., J Neurosci, 2009
Summary
Periodic signals (click trains, harmonic complexes) with repetition
rates between ca. 30 and 3000 Hz tend to have a strong pitch.
Aperiodic signals (noises, isolated clicks) have “weak” pitches or
no pitch at all.
Speech sounds include periodic (vowels, voiced consonants) and
aperiodic (unvoiced consonants) sounds.
There are competing place and timing (temporal) theories of
pitch.
Place theories depend on the representation of spectral peaks in
tonotopic parts of the auditory pathway. Some important features
(e.g. formants of speech sounds) are represented tonotopically,
but place cannot represent harmonic structure well.
Timing theories postulate that early stages of the auditory system
measure spike intervals in phase locked discharges.
In midbrain and cortex, pitch appears to be represented through
rate codes. Whether there are “periodotopic maps” or specialized
pitch areas in the central auditory system is highly controversial.