Spatial Hearing
PSYC 60041 Auditory Science
Spatial Hearing
Chris Plack
Spatial Hearing
• Learning Outcomes
– Understand the main cues used to localise sounds:
IIDs and ITDs
– Understand other cues to localisation including head
movements and pinna effects
– Understand the BMLD
– Understand what is meant by “the precedence effect”
and how it is measured
Some Definitions
Binaural hearing - hearing with both ears.
Localisation - the ability to locate a sound source in the
environment.
Lateralisation - the ability to localise in the head a
sound presented over headphones.
Localisation
Our ears give us much less information about object
location than our eyes.
We have only got two spatial channels for hearing (our
two ears) compared to arguably several million for
vision (the receptors in each retina).
However, we can hear sound sources that are beyond
our line of sight (e.g., behind our heads), and this
helps us to orient attention and can be important
for survival.
Binaural Cues
Our two ears can be used to localise sounds.
A sound will tend to be more intense, and arrive earlier,
at the ear closest to the sound source.
Hence, we can determine the direction of a sound
source based on:
Interaural Intensity Differences (IIDs - also called
ILDs, interaural level differences)
Interaural Time Differences (ITDs)
Intensity Cues
Mainly because of the shadowing effect of the head, a sound to the
right will be more intense in the right ear than in the left ear:
[Figure: head with left (L) and right (R) ears labelled]
Interaural Intensity Differences
The sound reaching the ear farthest from the source is
less intense due to head shadowing mainly, and
also to dissipation of intensity with distance
according to inverse-square law (only useful for
sounds close to head).
Low-frequency sounds diffract around the head, high-frequency
sounds don't, and thus a high-frequency shadow is cast
over the farthest ear.
Hence the IID is frequency dependent, being greatest
for high frequencies.
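The remark that the inverse-square law is only useful for sounds close to the head can be checked with a quick calculation. The sketch below is an illustration only: it assumes a point source lying on the line through both ears, ignores head shadowing entirely, and borrows the approx. 23 cm interaural distance quoted for ITDs. The function name `distance_iid_db` is made up for this example.

```python
import math

EAR_SEPARATION_M = 0.23  # assumed interaural distance (approx. 23 cm)


def distance_iid_db(near_ear_distance_m, ear_separation_m=EAR_SEPARATION_M):
    """Level difference (dB) between the ears from the inverse-square law alone.

    Assumes the source is on the interaural axis, so the far ear is
    exactly one ear separation further from the source than the near ear.
    """
    far_ear_distance_m = near_ear_distance_m + ear_separation_m
    # Intensity is proportional to 1/d^2, so the level difference is
    # 10*log10((d_far/d_near)^2) = 20*log10(d_far/d_near).
    return 20.0 * math.log10(far_ear_distance_m / near_ear_distance_m)


for distance_m in (0.25, 1.0, 10.0):
    print(f"{distance_m:5.2f} m: {distance_iid_db(distance_m):.2f} dB")
```

Under these assumptions the distance-based difference is several dB for a source 25 cm away but only a fraction of a dB at 10 m, which is why head shadowing is the main contributor for most sources.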
Timing Cues
Because sound travels relatively slowly (330 m/s), a sound from the
right will arrive perceptibly earlier at the right ear than at the
left ear:
[Figure: two panels, Time 1 and Time 2, each showing the left (L) and right (R) ears]
Interaural Time Differences
The interaural distance (approx. 23 cm) produces a
maximum ITD of about 0.69 ms when the source is
directly opposite one ear (90°).
The ITD falls to zero as the source moves forward or
backwards to be in front (0°) or behind (180°) the
listener.
Smallest detectable ITD is about 0.01 ms!
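A rough feel for how the ITD varies with direction comes from a simple path-difference model. The sketch below assumes that the extra path to the far ear is the interaural distance times the sine of the azimuth, which ignores the curvature of the head. The constants are the 23 cm and 330 m/s quoted on these slides, and the function name `itd_ms` is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 330.0  # value quoted on the slides
EAR_SEPARATION_M = 0.23     # approximate interaural distance


def itd_ms(azimuth_deg):
    """ITD in milliseconds for a source at the given azimuth.

    0 degrees is straight ahead, 90 degrees is directly opposite one ear,
    180 degrees is directly behind. Simplified sine model (an assumption).
    """
    path_difference_m = EAR_SEPARATION_M * math.sin(math.radians(azimuth_deg))
    return 1000.0 * path_difference_m / SPEED_OF_SOUND_M_S


for azimuth in (0, 30, 60, 90, 180):
    print(f"{azimuth:3d} deg: {itd_ms(azimuth):.3f} ms")
```

With these numbers the model gives a maximum of roughly 0.7 ms at 90°, close to the figure of about 0.69 ms, and an ITD of essentially zero for sources directly in front or behind.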
Ambiguities in ITDs
For a continuous pure tone, ambiguities arise when the
period of the tone is less than twice the ITD: the
closest peaks in the waveform may suggest that the wrong
ear is leading.
For a sound directly to the side, this occurs for a
frequency of about 735 Hz.
Ambiguities can be resolved if the tone is modulated,
i.e., if there are envelope cues (including abrupt
onsets).
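The frequency at which the ambiguity sets in follows directly from the rule above: the period equals twice the ITD, so the frequency is 1 / (2 x ITD). The helper below is a minimal sketch of that arithmetic, and its name is made up for illustration.

```python
def ambiguity_frequency_hz(itd_ms):
    """Frequency (Hz) whose period equals twice the given ITD (in ms).

    Above this frequency a continuous pure tone gives ambiguous ITD cues.
    """
    period_ms = 2.0 * itd_ms
    return 1000.0 / period_ms


# Maximum ITD quoted on the slides (source directly to one side).
print(round(ambiguity_frequency_hz(0.69), 1))
# A slightly smaller assumed maximum ITD, for comparison.
print(round(ambiguity_frequency_hz(0.68), 1))
```

An ITD of 0.69 ms gives roughly 725 Hz and an ITD of 0.68 ms gives roughly 735 Hz, so the exact figure depends on the maximum ITD assumed.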
Duplex Theory
The duplex theory suggests that sound localisation is
based on interaural time differences at low
frequencies and interaural intensity differences at
high frequencies.
However, for fluctuating high-frequency signals the
envelope can carry good timing information.
It is now thought that, for most sounds (which have
wideband spectra), ITDs may dominate at all
frequencies.
Minimum Audible Angle
Indicates the smallest change in sound source position
that can be detected by the listener.
Using sinusoidal signals, the MAA is smallest for
frontal signals (1° for frequencies below 1 kHz).
Around 1.5 kHz the IIDs are small and ITDs become
ambiguous resulting in an increase in MAA.
Performance worsens markedly as the source moves
away from a frontal position, but in the real world
the listener can move their head!
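As a rough consistency check, a 1° shift from straight ahead can be converted into a change in ITD. The sketch assumes the simplified sine model (path difference = interaural distance x sin(azimuth)) together with the 23 cm and 330 m/s values quoted earlier. The function name is hypothetical.

```python
import math


def itd_change_us(angle_deg, ear_separation_m=0.23, speed_m_s=330.0):
    """ITD (microseconds) produced by moving a frontal source by angle_deg.

    Uses the simplified sine model, which is an assumption and ignores
    diffraction around the head.
    """
    path_difference_m = ear_separation_m * math.sin(math.radians(angle_deg))
    return 1e6 * path_difference_m / speed_m_s


print(f"{itd_change_us(1.0):.1f} microseconds")
```

Under these assumptions a 1° MAA corresponds to an ITD of roughly 0.01 ms, which matches the smallest detectable ITD quoted earlier.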
The Cone of Confusion
Interaural time and intensity differences are ambiguous.
Same IIDs and ITDs
for sound source on
surface of cone:
For example, we can’t tell the difference between a
sound directly in front and a sound directly behind
using IIDs or ITDs.
The Cone of Confusion
Ambiguities can be resolved by:
Head movements
Spectral effects of pinna, head, and torso reflections
Head Movements
[Figure: a) ITD = 0: ? b) ITD shows left leading: ?]
…but what if sound is too brief?
Effects of Pinna
The pinna modifies sound entering ear depending on
direction, resolving ambiguities and providing cues
to source elevation:
[Figure: spectra for 15° and -15° elevation. Horizontal axis: Frequency (kHz), 3 to 10. Scale bar: 10 dB.]
Effects of Pinna
Because of the shape of the concha, sounds at higher
elevations have shorter reflected path lengths,
hence a notch at a higher frequency:
Hebrank & Wight (1974) JASA 56, p. 1829
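The link between reflected path length and notch frequency can be illustrated with a simple delay-and-add model. This is a sketch under stated assumptions, not the analysis in the cited paper. It assumes a single reflection added to the direct sound, so that the first cancellation occurs where the extra delay is half a period. The path lengths used are hypothetical values chosen only to show the trend.

```python
SPEED_OF_SOUND_M_S = 330.0  # value quoted on the slides


def first_notch_hz(extra_path_m):
    """Frequency of the first spectral notch for one delayed reflection.

    The reflection travels extra_path_m further than the direct sound.
    Direct and reflected sound cancel when the delay is half a period.
    """
    delay_s = extra_path_m / SPEED_OF_SOUND_M_S
    return 1.0 / (2.0 * delay_s)


# Hypothetical extra path lengths: longer (lower elevation), shorter (higher).
for extra_path_m in (0.025, 0.020):
    notch_khz = first_notch_hz(extra_path_m) / 1000.0
    print(f"{extra_path_m * 100:.1f} cm: {notch_khz:.2f} kHz")
```

In this model the shorter extra path gives the higher notch frequency, which is the direction of the effect described above.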
Evidence for Importance of Spectral Cues in Vertical Localisation
Accurate vertical localisation only with broadband
signals (and only those with energy > 4 kHz).
Vertical localisation prevented by occlusion of the
convolutions in the pinnae (horizontal localisation
unaffected apart from front/back distinctions).
Vertical localisation almost as good with single ear.
Vertical localisation sensitive to manipulations of the
source spectrum.
Middlebrooks & Green (1991) Ann. Rev. Psychol. 42, p. 135
Distance Perception
Loudness is an important cue for distance: In the direct
field (little reverberation) the further away the
source is, the quieter the sound.
Better with familiar sounds (e.g. speech).
Direct-to-reverberant energy ratio is another cue: The
closer the sound the louder the direct sound will
be compared with the early reflected sounds.
e.g. Zahorik (2002) JASA 111, p. 1832
Binaural Unmasking
Binaural Masking Level Difference
Measure tone (signal) threshold in presence of
broadband masker with identical signal and
masker to both ears.
Invert phase of tone (or masker) in one ear so that
signal and masker are lateralised differently.
Masked signal threshold is lower (binaural release from
masking).
The difference between in-phase and altered-phase
thresholds is called the BMLD.
• NoSo - masker & signal same phase at both ears - poor detection
• NoSπ - masker same phase, signal π radians out of phase - good detection
• NmSm - masker and signal presented monaurally - poor detection
• NoSm - masker same to both ears, signal monaural - good detection
The BMLD is frequency dependent since it relies on ITDs:
Condition    BMLD (dB)
NuSπ         3
NuSo         4
NπSm         6
NoSm         9
NπSo         13
NoSπ         15
N = noise masker, S = signal, u = uncorrelated noise, o
= no phase shift, m = monaural, π = 180° phase shift
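Since the BMLD is the difference between the in-phase threshold and the altered-phase threshold, the tabulated values can be read as the number of dB by which the masked threshold drops in each condition. The sketch below only does that subtraction. The 60 dB reference threshold is a hypothetical value and the function name is made up for illustration.

```python
# BMLD values (dB) for each condition, taken from the table above.
BMLD_DB = {
    "NuSπ": 3,
    "NuSo": 4,
    "NπSm": 6,
    "NoSm": 9,
    "NπSo": 13,
    "NoSπ": 15,
}


def masked_threshold_db(in_phase_threshold_db, condition):
    """Signal threshold in a condition, given the in-phase threshold."""
    return in_phase_threshold_db - BMLD_DB[condition]


reference_db = 60.0  # hypothetical in-phase (NoSo) threshold
for condition in BMLD_DB:
    print(condition, masked_threshold_db(reference_db, condition))
```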
Huggins Pitch
Present the same noise to both ears over headphones:
the noise is lateralised to the centre of the head.
Now decorrelate a narrow band of noise between the
ears (so that the band is different between the
ears).
This band “pops out” and is heard as having a pitch
corresponding to the centre frequency of the
band: Huggins pitch.
Huggins Pitch
[Figure: noise spectrum, the same in both ears except for a narrow band at 500 Hz that is decorrelated between the ears. Horizontal axis: Frequency (Hz).]
Gockel, Carlyon, and Plack (2010). Can Huggins pitch harmonics
be combined with diotic pure tone harmonics to produce a
residue pitch?
Mixed-mode conditions (1 HP + 1 NBN):
Single-mode conditions (2 HP or 2 NBN):
Present two successive pairs of harmonics. Does pitch change
follow analytic (spectral) or synthetic (residue) pitch?
Frequencies (Hz)    F0 (Hz)    Harmonic Numbers
400 + 800           400        1st & 2nd
533 + 800           267        2nd & 3rd
640 + 800           160        4th & 5th
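The rows of the table follow from treating the fixed 800 Hz component as the upper of two adjacent harmonics. The sketch below reproduces that arithmetic. The function name is hypothetical, and rounding to the nearest Hz is assumed to match the table.

```python
def harmonic_pair(upper_hz, upper_harmonic_number):
    """Return (F0, lower component) for two adjacent harmonics.

    upper_hz is the frequency of the higher component and
    upper_harmonic_number is its harmonic number (2 = 2nd harmonic).
    """
    f0_hz = upper_hz / upper_harmonic_number
    lower_hz = f0_hz * (upper_harmonic_number - 1)
    return f0_hz, lower_hz


for n in (2, 3, 5):
    f0_hz, lower_hz = harmonic_pair(800.0, n)
    print(f"harmonics {n - 1} & {n}: {round(lower_hz)} + 800 Hz, F0 = {round(f0_hz)} Hz")
```

The residue pitch (F0) falls from 400 to 267 to 160 Hz across the three pairs while the lower component rises from 400 to 533 to 640 Hz, so the analytic and synthetic pitches move in opposite directions.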
Responses of listeners in the mixed-mode and single-mode
conditions were highly correlated:
Suggests Huggins and diotic harmonics are processed by the
same mechanism and combined after the MSO (medial superior olive).
The Precedence Effect
In a reverberant space (such as a room) sound from a
source reflects off the walls, and arrives at the ear
from different directions.
Why don’t these reflections confuse the auditory
system?
The Precedence Effect
Direct sound follows shortest path and arrives first.
The auditory system takes advantage of this by
restricting analysis to the sound arriving first, i.e.,
the first-arriving wavefront takes precedence.
The Precedence Effect
For example, a click is played from two loudspeakers separated by
80°. If the clicks are simultaneous, the sound is heard between the
loudspeakers.
A delay is imposed on the left loudspeaker. For delays of 0-1 ms, the
sound image moves towards the right loudspeaker. For
delays of 1-30 ms, the image is localised at the right
loudspeaker with no contribution from the left
(precedence effect).
For delays > 30-35 ms the effect breaks down, and a direct
sound and echo are heard.
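The delay ranges in this demonstration can be summarised as a small lookup. The sketch below simply encodes the ranges given above. The function name is made up, and the upper boundary is set at 30 ms even though the breakdown is described as occurring somewhere around 30-35 ms.

```python
def percept_for_delay(delay_ms):
    """Describe what is heard when the left loudspeaker lags by delay_ms."""
    if delay_ms < 0:
        raise ValueError("delay_ms must be zero or positive")
    if delay_ms == 0:
        return "single image between the loudspeakers"
    if delay_ms <= 1:
        return "image shifted towards the leading (right) loudspeaker"
    if delay_ms <= 30:  # approximate limit; the slides give 30-35 ms
        return "image at the leading (right) loudspeaker (precedence effect)"
    return "direct sound and echo heard separately"


for delay_ms in (0, 0.5, 10, 50):
    print(f"{delay_ms} ms: {percept_for_delay(delay_ms)}")
```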
Virtual Auditory Space
Sounds presented over headphones tend to be
lateralised inside the head.
However, if we record sounds using two microphones
in the ear canals (or in the ear canals of a “dummy
head”) then when this recording is presented over
headphones it seems external and can be
localised outside the head.
The cues from the pinna, head, and torso help to give a
recording a spacious quality when presented over
headphones.
Dummy Head Recordings