Chapter 12:
Sound Localization and the
Auditory Scene
Overview of Questions
• What makes it possible to tell where a sound
is coming from in space?
• When we are listening to a number of musical
instruments playing at the same time, how
can we perceptually separate the sounds
coming from the different instruments?
• Why does music sound better in some
concert halls than in others?
Auditory Localization
• Auditory space - surrounds an observer and
exists wherever there is sound
• Researchers study how sounds are localized
in space by using:
– Azimuth coordinates - position left to right
– Elevation coordinates - position up and
down
– Distance coordinates - position from
observer
Figure 12.1 The three directions used for studying sound localization: azimuth (left-right); elevation (up-down); and distance.
Auditory Localization - continued
• On average, people can localize sounds
– Directly in front of them most accurately
– To the sides and behind their heads least
accurately.
• Location cues are not contained in the
receptor cells like on the retina in vision; thus,
location for sounds must be calculated.
Figure 12.3 Comparing location information for vision and hearing. Vision: The bird and the cat are at
different locations and are imaged on different places on the retina. Hearing: The frequencies in the sounds
from the bird and the cat are spread out over the cochlea, with no regard to the locations of the bird and the
cat.
Cues for Sound Location
• Binaural cues - location cues based on the
comparison of the signals received by the left
and right ears
– Interaural time difference (ITD)- difference
between the times sounds reach the two
ears
• When the distance to each ear is the
same, there is no time difference.
• When the source is to the side of the
observer, the times will differ.
Figure 12.4 The principle behind interaural time difference (ITD). The tone directly in front of the listener, at
A, reaches the left and the right ears at the same time. However, when the tone is off to the side, at B, it
reaches the listener’s right ear before it reaches the left ear.
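The size of the ITD for a given source direction can be made concrete with the classic Woodworth spherical-head approximation, a standard textbook formula that is not given in this chapter, so treat the sketch below as an illustrative assumption (head radius and speed of sound are typical round values):

```python
import math

def itd_woodworth(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference (in seconds) for a source at
    a given azimuth, using the Woodworth spherical-head formula:
    ITD = (r / c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# A source straight ahead (point A in Figure 12.4) gives no difference;
# a source at 90 degrees (off to the side) gives the maximum ITD.
print(itd_woodworth(0))                       # 0.0 seconds
print(round(itd_woodworth(90) * 1000, 2))     # maximum ITD in ms
```

With these values, the maximum ITD at 90 degrees works out to roughly two thirds of a millisecond, which is the tiny timing difference the auditory system must detect.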
Binaural Cues
– Interaural level difference (ILD)- difference
in sound pressure level reaching the two
ears
• A reduction in intensity occurs at the far
ear for high-frequency sounds.
–The head casts an acoustic shadow.
• This effect doesn’t occur for low
frequency sounds.
Figure 12.5. Why interaural level difference (ILD) occurs for high frequencies but not for low frequencies. (a)
When water ripples are small compared to an object, such as this boat, they are stopped by the object. (b)
The spacing between high-frequency sound waves is small compared to the head. The head interferes with
the sound waves, creating an acoustic shadow on the other side of the head. (c) The same ripples are large
compared to the single cattail, so they are unaffected by it. (d) The spacing between low-frequency sound
waves is large compared to the person’s head, so the sound is unaffected by the head.
Figure 12.6 The three curves indicate interaural level difference (ILD) as a function of frequency for three
different sound source locations. Note that the difference in ILD for different locations is higher at high
frequencies (Adapted from Hartmann, 1999).
Monaural Cue for Sound Location
• The pinna and head affect the intensities of
frequencies.
• Measurements have been performed by
placing small microphones in ears and
comparing the intensities of frequencies with
those at the sound source.
– This is a spectral cue since the information
for location comes from the spectrum of
frequencies.
Judging Elevation
• ILD and ITD are not effective for judging
elevation, since at many elevations both
cues may be zero.
• Experiment investigating spectral cues
– Listeners were measured for performance
locating sounds differing in elevation.
– They were then fitted with a mold that
changed the shape of their pinnae.
Experiment on Judging Elevation
– Right after the molds were inserted,
performance was poor for elevation but
was unaffected for azimuth.
– After 19 days, performance for elevation
was close to original performance.
– Once the molds were removed,
performance stayed high.
– This suggests that there might be two
different sets of neurons—one for each set
of cues.
Figure 12.9 How localization changes when a mold is placed in the ear. See text for explanation. (Hofman
et al., 1998).
The Physiological Representation of
Auditory Space
• Two mechanisms have been proposed
– Narrowly tuned ITD neurons
• They are found in the inferior colliculus
and the superior olivary nuclei.
• This response is a form of specificity
coding.
Figure 12.10 ITD tuning curves for six neurons that each respond to a narrow range of ITDs. The neurons
on the left respond when sound reaches the left ear first. The ones on the right respond when sound
reaches the right ear first. Neurons such as this have been recorded from the barn owl and other animals
(Adapted from McAlpine, 2005).
The Physiological Representation of
Auditory Space - continued
• Jeffress Model for narrowly tuned ITD
neurons
– These neurons receive signals from both
ears.
– Coincidence detectors fire only when
signals arrive from both ears
simultaneously.
– Other neurons in the circuit fire to locations
corresponding to other ITDs.
Figure 12.11 How the Jeffress circuit operates. Axons transmit signals from the left ear (blue) and right ear
(red) to neurons, indicated by circles. (a) Sound in front: signals start in left and right channels
simultaneously. (b) Signals meet at neuron 5, causing it to fire. (c) Sound to the right: signal starts in the
right channel first. (d) Signals meet at neuron 3, causing it to fire.
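The delay-line logic of Figure 12.11 can be sketched as a toy model. This is an illustrative simplification, not the chapter's own formalism: neurons are numbered 1 to 9 as in the figure, delays are counted in arbitrary integer steps, and the coincidence detector whose internal delays exactly cancel the interaural difference is the one that fires.

```python
def jeffress_winner(itd_steps, n_neurons=9):
    """Toy Jeffress delay line with neurons numbered 1..n_neurons.
    Neuron k adds (k - 1) delay steps to the left-ear signal and
    (n_neurons - k) steps to the right-ear signal. itd_steps > 0
    means the sound reached the right ear first (source on the right)."""
    left_arrival = max(itd_steps, 0)    # left ear hears late for a right source
    right_arrival = max(-itd_steps, 0)  # right ear hears late for a left source
    for k in range(1, n_neurons + 1):
        # coincidence detector k fires only if both signals arrive together
        if left_arrival + (k - 1) == right_arrival + (n_neurons - k):
            return k
    return None  # ITD too large for this array of detectors

print(jeffress_winner(0))  # 5: middle neuron fires for a sound straight ahead
print(jeffress_winner(4))  # 3: off-center neuron fires when the right ear leads
```

Because each neuron fires only for its own ITD, the circuit is a concrete example of the specificity coding mentioned above: which neuron fires tells you where the sound is.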
The Physiological Representation of
Auditory Space - continued
• Broadly-tuned ITD neurons
– Research on gerbils indicates that neurons
in the left hemisphere respond best to
sound from the right, and vice versa.
– Location of sound is indicated by the ratio
of the responses of the two types of
neurons.
– This is a distributed coding system.
Figure 12.12 (a) ITD tuning curves for broadly-tuned neurons. The left curve represents the tuning of
neurons in the right hemisphere; the right curve is the tuning of neurons in the left hemisphere; (b) Patterns
of response of the broadly tuned curves for stimuli coming from the left, in front, and the right. Neurons such
as this have been recorded from the gerbil (Adapted from McAlpine, 2005).
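The distributed code in Figure 12.12 can be illustrated with two mirrored, broadly tuned curves. The sigmoid shape and its scale are illustrative assumptions; only the opponent left/right arrangement comes from the chapter.

```python
import math

def broad_response(itd_us, preferred_side):
    """Toy broadly tuned ITD neuron: firing rises as the sound moves
    toward the hemisphere's preferred (opposite) side. preferred_side is
    +1 for left-hemisphere neurons (which prefer sounds from the right)
    and -1 for right-hemisphere neurons."""
    return 1.0 / (1.0 + math.exp(-preferred_side * itd_us / 100.0))

def decode_side(itd_us):
    """Distributed code: compare the two populations' firing rates."""
    left_hemi = broad_response(itd_us, +1)
    right_hemi = broad_response(itd_us, -1)
    if abs(left_hemi - right_hemi) < 1e-9:
        return "front"
    return "right" if left_hemi > right_hemi else "left"

print(decode_side(0))     # "front": both populations respond equally
print(decode_side(300))   # "right": left-hemisphere population dominates
print(decode_side(-300))  # "left": right-hemisphere population dominates
```

Unlike the Jeffress scheme, no single neuron signals a location here; location is read out from the pattern across the two populations, which is what makes it a distributed code.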
Identifying Sound Sources
• Auditory Scene - the array of all sound
sources in the environment
• Auditory Scene Analysis - process by which
sound sources in the auditory scene are
separated into individual perceptions
• This does not happen at the cochlea since
simultaneous sounds are together in the
pattern of vibration of the basilar membrane.
Figure 12.13 Each musician produces a sound stimulus, and all three sounds are combined in the output of
the loudspeaker.
Principles of Auditory Grouping
• Heuristics that help to perceptually organize
stimuli
– Onset time - sounds that start at different
times are likely to come from different
sources
– Location - a single sound source tends to
come from one location and to move
continuously
– Similarity of timbre and pitch - similar
sounds are grouped together
Auditory Stream Segregation
• Compound melodic line in music is an
example of auditory stream segregation.
• Experiment by Bregman and Campbell
– Stimuli were alternating high and low tones
– When the stimuli are played slowly, the
listener hears high and low tones
alternating.
– When the stimuli are played quickly, the
listener hears two streams: one high and
one low.
Figure 12.14 Four measures of a composition by J. S. Bach (Chorale Prelude on Jesus Christus unser
Heiland, 1739). When played rapidly, the upper notes become perceptually grouped and the lower notes
become perceptually grouped, a phenomenon called auditory stream segregation.
Figure 12.15 (a) When high and low tones are alternated slowly, auditory stream segregation does not
occur, so the listener perceives alternating high and low tones. (b) Faster alternation results in segregation
into high and low streams.
Auditory Stream Segregation - continued
• Experiment by Bregman & Rudnicky
– Listener hears two standard tones (X & Y)
and can easily perceive the order.
– Then two distractor (D) tones are placed
between X & Y and the listener can no
longer perceive the order.
– Adding a series of captor tones (C) forms a
stream with the D tones, and the listener
can once again hear the order of X & Y.
Figure 12.16 Bregman and Rudnicky’s (1975) experiment. (a) The standard tones X and Y have different
pitches. (b) The distractor (D) tones group with X and Y, making it difficult to judge the order of X and Y. (c)
The addition of captor (C) tones with the same pitch as the distractor tones causes the distractor tones to
form a separate stream (grouping by similarity), making it easier to judge the order of tones X and Y (Based
on Bregman & Rudnicky, 1975).
Figure 12.17 (a) Two sequences of stimuli: a series of similar notes (red), and a scale (blue). (b) Perception
of these stimuli: Separate streams are perceived when they are far apart in frequency, but when the
frequencies are in the same range, the tones appear to jump back and forth between stimuli.
Auditory Stream Segregation - continued
• Experiment by Deutsch - the scale illusion or
melodic channeling
– Stimuli were two sequences alternating
between the right and left ears.
– Listeners perceive two smooth sequences
by grouping the sounds by similarity in
pitch.
– This demonstrates the perceptual heuristic
that sounds with the same frequency come
from the same source, which is usually true
in the environment.
Figure 12.18 (a) These stimuli were presented to a listener’s left ear (blue) and right ear (red) in Deutsch’s
(1975) scale illusion experiment. Notice how the notes presented to each ear jump up and down. (b) What
the listener hears. Although the notes in each ear jump up and down, the listener perceives a smooth
sequence of notes. This effect is called the scale illusion, or melodic channeling. (Adapted from Deutsch,
1975).
Principles of Auditory Grouping - continued
• Proximity in time - sounds that occur in rapid
succession usually come from the same
source
– This principle was illustrated in auditory
streaming.
• Auditory continuity - sounds that stay
constant or change smoothly are usually from
the same source
Good Continuation
• Experiment by Warren et al.
– Tones were presented interrupted by gaps
of silence or by noise.
– In the silence condition, listeners perceived
that the sound stopped during the gaps.
– In the noise condition, the perception was
that the sound continued behind the noise.
Figure 12.19 A demonstration of auditory continuity, using tones.
Principles of Auditory Grouping - continued
• Effect of past experience
– Experiment by Dowling
• Melody “Three Blind Mice” is played with
notes alternating between octaves
• Listeners find it difficult to identify the
song
• But after they hear the normal melody,
they can then hear it in the modified
version using melody schema
Figure 12.20 “Three Blind Mice”: (a) jumping octave version; (b) normal version.
Hearing Inside Rooms
• Direct sound - sound that reaches the
listener’s ears straight from the source
• Indirect sound - sound that is reflected off of
environmental surfaces and then to the
listener
• When a listener is outside, most sound is
direct; inside a building, however, there is
both direct and indirect sound.
Figure 12.21 (a) When you hear a sound outside, you hear mainly direct sound (path a). (b) When you hear
a sound inside a room, you hear both direct (a) and indirect sound (b, c, and d) that is reflected from the
walls, floor, and ceiling of the room.
Experiment by Litovsky et al.
• Listeners sat between two speakers: a lead
speaker and a lag speaker.
• When sound comes from the lead speaker
followed by the lag speaker with a long delay,
listeners hear two sounds.
• When the delay is decreased to 5 to 20 ms,
listeners hear the sound as only coming from
the lead speaker - the precedence effect.
Figure 12.22 (a) When sound is presented first in one speaker and then in the other, with enough time
between them, they are heard separately, one after another. (b) If there is only a short delay between the
two sounds, then the sound is perceived to come from the lead speaker. This is the precedence effect.
Architectural Acoustics
• The study of how sounds are reflected in
rooms.
• Factors that affect perception in concert halls.
– Reverberation time - the time it takes
sound to decrease to 1/1000th of its
original pressure
• If it is too long, sounds are “muddled.”
• If it is too short, sounds are “dead.”
• Ideal times are around two seconds.
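A drop to 1/1000th of the original pressure is a 60 dB drop, which is why reverberation time is often written RT60. It can be estimated from a hall's geometry with Sabine's classical formula, a standard acoustics result that the chapter does not derive; the hall dimensions below are hypothetical numbers chosen only to illustrate the arithmetic.

```python
import math

# 1/1000 of the original sound pressure corresponds to a 60 dB drop:
print(20 * math.log10(1000))  # 60.0

def sabine_rt60(volume_m3, total_absorption_m2):
    """Sabine's estimate of reverberation time (seconds):
    RT60 = 0.161 * V / A, with V the room volume in cubic meters and
    A the total absorption in sabins (square meters of perfect absorber)."""
    return 0.161 * volume_m3 / total_absorption_m2

# Hypothetical concert hall: values chosen to land near the ideal.
print(round(sabine_rt60(12000.0, 966.0), 2))  # ~2 s reverberation time
```

The formula makes the design trade-off visible: a bigger volume lengthens reverberation ("muddled" if too long), while more absorbing surface shortens it ("dead" if too short).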
Factors that Affect Perception in Concert
Halls
– Intimacy time - time between when sound
leaves its source and when the first
reflection arrives
• Best time is around 20 ms.
– Bass ratio - ratio of low to middle
frequencies reflected from surfaces
• High bass ratios are best.
– Spaciousness factor - fraction of all the
sound received by listener that is indirect
• High spaciousness factors are best.
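Intimacy time follows directly from the difference between the direct and reflected path lengths. A minimal sketch (the path lengths are hypothetical; only the ~20 ms target comes from the slide):

```python
SPEED_OF_SOUND = 343.0  # m/s

def intimacy_time_ms(direct_path_m, reflected_path_m):
    """Delay (ms) between the direct sound and the first reflection,
    computed from the extra distance the reflected sound travels."""
    return (reflected_path_m - direct_path_m) / SPEED_OF_SOUND * 1000.0

# If the first reflection travels about 6.86 m farther than the direct
# sound, it arrives about 20 ms late, near the preferred intimacy time.
print(round(intimacy_time_ms(10.0, 16.86), 1))  # 20.0
```

In practice this means the nearest reflecting surfaces must be close enough that the reflected path is only a few meters longer than the direct one, which is why narrow halls tend to feel more "intimate."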
Acoustics in Classrooms
• Ideal reverberation time in classrooms is
– 0.4 to 0.6 seconds for small classrooms.
– 1.0 to 1.5 seconds for auditoriums.
– These maximize ability to hear voices.
– Most classrooms have times of one second
or more.
• Background noise is also problematic.
– Signal to noise ratio should be +10 to +15
dB or more.
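The recommended signal-to-noise ratio can be expressed as a pressure ratio using the standard decibel formula (the specific pressure values below are illustrative, not from the chapter):

```python
import math

def snr_db(signal_pressure, noise_pressure):
    """Signal-to-noise ratio in decibels from sound pressures:
    SNR = 20 * log10(p_signal / p_noise)."""
    return 20.0 * math.log10(signal_pressure / noise_pressure)

# A voice whose pressure is ~3.16x the background noise is at about
# +10 dB, the low end of the recommended classroom range.
print(round(snr_db(3.16, 1.0), 1))
```

Equal signal and noise pressures give 0 dB, so the +10 to +15 dB guideline requires the teacher's voice to carry roughly three to six times the sound pressure of the background noise at the student's ear.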
Interactions Between Vision and Sound
• Visual capture or the ventriloquist effect - an
observer perceives the sound as coming from
the visual location rather than from its actual
source
• Experiment by Sekuler et al.
– Balls moving without sound appeared to
move past each other.
– Balls with an added “click” appeared to
collide.
Figure 12.24 Two conditions in the Sekuler et al. (1999) experiment showing successive positions of two
balls that were presented so they appeared to be moving. (a) No sound condition: the two balls were
perceived to pass each other and continue moving in a straight-line motion. (b) Click added condition:
Observers were likely to see the balls as colliding.