Pitch Perception - University of Limerick

Pitch Perception
Objective
• To understand the method(s) by which the
auditory system processes a sound in order
to determine its pitch.
• Audible range: 20 Hz – 20 kHz*
• The pitch of a sound refers to its perceived
tonal height and is subjective; it requires the
listener to make a perceptual judgement
• Variations in pitch create a sense of melody
Measuring pitch
• A method sometimes employed as an objective
measure of assigning a pitch to a sound:
• the listener adjusts the frequency of a comparison sound
of known, variable frequency and similar timbre until
the pitches of the two sounds are perceived as being
equal.
• This method gives the unit of Hertz (Hz) as a
measure of the pitch frequency.
• A complex sound is a sound containing more than
one frequency component.
• The sound is harmonic if the frequency
components occur at integer multiples of the
frequency of a common (though not always
present) fundamental component.
• The waveform of a harmonic sound repeats
periodically at a rate equal to the frequency of the
fundamental component.
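As a rough illustration of the periodicity claim above, the following Python sketch sums a few harmonics of an assumed 262 Hz fundamental (close to the note C4) and checks that the waveform repeats after one fundamental period but not after half of it. The 1/h amplitudes, the sampling grid and the names harmonic_wave and max_change are assumptions made for this example only.

```python
import math

def harmonic_wave(t, f0, harmonics):
    """Sum of sinusoids at integer multiples of f0 (assumed 1/h amplitudes)."""
    return sum(math.sin(2 * math.pi * h * f0 * t) / h for h in harmonics)

f0 = 262.0                 # assumed fundamental, roughly C4
period = 1.0 / f0          # repetition period in seconds
harmonics = [1, 2, 3, 4, 5]
times = [i / 44100 for i in range(2000)]   # assumed 44.1 kHz sampling grid

def max_change(shift):
    """Largest difference between the waveform and a time-shifted copy."""
    return max(abs(harmonic_wave(t, f0, harmonics)
                   - harmonic_wave(t + shift, f0, harmonics)) for t in times)

full_period_diff = max_change(period)        # expected to be near zero
half_period_diff = max_change(period / 2)    # expected to be clearly non-zero
print(full_period_diff, half_period_diff)
```

Shifting by one fundamental period should leave the waveform essentially unchanged, which is what "repeats periodically at a rate equal to the frequency of the fundamental" means in practice.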
[Figure: periodic waveform (upper panel, time axis from 0.15 to 0.2 s) and spectrum (lower panel, frequency axis in Hz with ticks at 262, 523, 786, 1048 and 1310 Hz, plotted against time)]
• Examples of harmonic sounds are the notes
produced from musical instruments such as the
violin, oboe and flute.
• Very clear sense of pitch
• Diagram: periodic waveform (upper) and
spectrum (lower) of an oboe playing C4
• Repetition rate of the waveform
• Fundamental component in the frequency
spectrum
Pitch of a pure tone
• If the sound is a pure tone the perceived pitch
generally corresponds to and varies with the
frequency of the tone.
• Its pitch varies also to some extent with level (asa
trk 28).
• What duration must a tone be, before it gives a
sense of pitch (i.e. from ‘click’ to ‘tone’)? (asa trk
29)
• Pitch of a sound may be influenced by the
presence of another sound (asa trk 30)
The pitch of harmonic sounds
• The pitch of a harmonic sound is found to vary
mainly with changes in the fundamental frequency
and so this is used as a measure of the pitch.
• Harmonic sounds - clearly defined sense of pitch
• Pitch of the ‘missing fundamental’
• A pitch corresponding to the fundamental
frequency may be perceived when it has been
removed from the sound (asa trk 37)
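The missing-fundamental effect is consistent with the waveform's repetition rate being unchanged when the fundamental is removed. The sketch below, under stated assumptions, builds two complex tones (with and without a 200 Hz component) and estimates the repetition rate from the largest autocorrelation peak. The 200 Hz fundamental, the 8 kHz sample rate, the 150 to 1000 Hz search range and the function names are all assumptions for illustration; this shows a property of the signal, not a model of what a listener hears.

```python
import math

def complex_tone(freqs, sr, n):
    """n samples of equal-amplitude sinusoids at the given frequencies."""
    return [sum(math.sin(2 * math.pi * f * i / sr) for f in freqs)
            for i in range(n)]

def repetition_rate(signal, sr, min_f=150.0, max_f=1000.0):
    """Frequency (Hz) whose period gives the largest autocorrelation."""
    min_lag, max_lag = int(sr / max_f), int(sr / min_f)
    n = len(signal) - max_lag
    def autocorr(lag):
        return sum(signal[i] * signal[i + lag] for i in range(n))
    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sr / best_lag

sr = 8000                                   # assumed sample rate in Hz
with_f0 = complex_tone([200, 400, 600, 800], sr, 800)
without_f0 = complex_tone([400, 600, 800], sr, 800)
rate_with = repetition_rate(with_f0, sr)
rate_without = repetition_rate(without_f0, sr)
print(rate_with, rate_without)
```

Both tones should give a repetition rate close to 200 Hz, even though the second contains no energy at 200 Hz.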
Theories of Pitch Perception
• There are two important theories of how the
auditory system is believed to code the pitch of a
sound: the place theory and the temporal theory.
• The place theory is based on the fact that different
frequency components of the input sound
stimulate different places along the basilar
membrane and in turn auditory nerve fibres with
different characteristic frequencies.
Place theory of pitch perception
• The pitch of the sound is assumed to be related to
the excitation pattern it produces on the BM.
• The pitch of a pure tone may be explained by the
position of maximum excitation.
• For a sound made up of many frequency
components, many different maxima occur along
the basilar membrane at places corresponding to
the frequencies of the components.
• The position of the overall maximum, or the
position of the maximum due to the lowest
frequency component may not correspond to the
perceived pitch of the sound.
• It is known that the pitch of a harmonic sound can
remain the same even when energy at its
fundamental frequency has been removed.
• This cannot be explained by the place theory
Temporal theory of pitch
perception
• The waveform of a sound with a strong
unambiguous pitch is periodic.
• The basis for the temporal theory of pitch
perception is the timing of neural firings, which
occur in response to vibrations on the basilar
membrane.
• Nerve firings occur at particular phases of the
waveform; a process called phase locking.
Temporal theory
• Due to phase locking the time intervals between
the successive firings occur at approximately
integer multiples of the period of the waveform.
• In this way the waveform periodicity that occurs at
each place on the basilar membrane is coded.
• At some point in the auditory system these time
intervals have to be measured.
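A toy simulation can make the interval pattern concrete. In the sketch below a hypothetical fibre fires only at waveform peaks, and on any given cycle it fires with an assumed probability, so cycles are often skipped. The firing probability, the 200 Hz tone and the function name are assumptions; real fibres also show timing jitter, which is left out here, so the intervals come out as exact rather than approximate multiples.

```python
import random

def phase_locked_spikes(freq, duration, fire_prob, seed=0):
    """Spike times at waveform peaks; each cycle fires with probability fire_prob."""
    rng = random.Random(seed)
    period = 1.0 / freq
    n_cycles = int(duration * freq)
    return [k * period for k in range(n_cycles) if rng.random() < fire_prob]

freq = 200.0                                   # assumed tone frequency in Hz
spikes = phase_locked_spikes(freq, duration=1.0, fire_prob=0.4)

# Intervals between successive firings, expressed in units of the period.
intervals = [b - a for a, b in zip(spikes, spikes[1:])]
multiples = [interval * freq for interval in intervals]
print(sorted(set(round(m) for m in multiples)))
```

Every interval is a whole number of periods (1, 2, 3, ... cycles), so the period can in principle be recovered from the interval distribution even though the fibre does not fire on every cycle.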
Temporal theory
• The precision with which the nerve firings are
linked to a particular phase in the waveform
declines at high frequencies: upper limit of ~ 4-5
kHz.
• The ability to perceive pitches of sounds with
fundamental frequencies greater than 5 kHz
cannot be explained by this theory.
• It has been found that musical interval and melody
perception deteriorate for sounds with fundamental
frequencies greater than 5 kHz, although
differences in frequency can still be heard.
• Sounds produced by musical instruments,
the human voice and most everyday sounds
have fundamental frequencies below 5 kHz.
In sum
• The auditory processing parts of the brain are
supplied with information concerning the place of
stimulation on the basilar membrane (place
theory) and neural firing patterns (temporal
theory).
• The importance of both types of information may
depend on the frequencies present and the type of
sound.
• Place coding may dominate for frequencies above
5 kHz where phase locking is reduced - below this
temporal information may be dominant.
Frequency discrimination
• The ability to detect changes in frequency over
time.
• difference limen for frequency (DLF) – the
smallest detectable change in frequency
• For a pair of pure tones the listener judges whether
the second tone is higher or lower in pitch than the
first. The DLF is the frequency separation for
which there is a certain percentage of correct
responses, e.g. 75%.
• Frequency discrimination of pure tones (asa trk
33)
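The 75% criterion can be sketched as a simple interpolation on a psychometric function. The data points below are invented purely for illustration (they are not measurements from the lecture or from any study), and the function name threshold_at is hypothetical.

```python
def threshold_at(points, target=0.75):
    """Linearly interpolate the frequency separation giving `target` proportion correct."""
    pts = sorted(points)
    for (x0, p0), (x1, p1) in zip(pts, pts[1:]):
        if p0 <= target <= p1 and p1 > p0:
            return x0 + (target - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("target not bracketed by the data")

# Invented illustrative data: (frequency separation in Hz, proportion correct)
data = [(0.5, 0.52), (1.0, 0.60), (2.0, 0.70), (4.0, 0.90), (8.0, 0.99)]
dlf = threshold_at(data)
print(f"Estimated DLF at 75% correct: {dlf:.2f} Hz")
```

With these made-up numbers the 75% point falls between the 2 Hz and 4 Hz separations. In practice adaptive procedures or fitted psychometric functions are often used instead of straight-line interpolation.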
Frequency discrimination
• DLF varies from person to person
• The DLF has been found to depend on frequency,
level, duration, suddenness of the frequency
change and musical training
• A large increase in DLF for frequencies > 4-5 kHz
• DLF improves (i.e. becomes smaller) with increasing duration (at least
up to 200 ms, Moore 2003)
• DLF improves with increasing level.
Models of pitch perception
• Models to account for pitch perception in complex
tones
• more than one frequency component, and hearing
a pitch corresponding to the missing fundamental.
• Pattern recognition models – frequency analysis
to determine individual components present,
pattern recogniser to determine the pitch from the
components
Pattern recognition models
• The pattern recogniser attempts to find the
fundamental frequency that corresponds to the
harmonics detected – template matching process.
• Some models of this type are:
• Goldstein, J. L., 1973. “An optimum processor
theory for the central formation of the pitch of
complex tones.” J. Acoust. Soc. Am., 54(6), 1496-1516
Pattern recognition models
• Terhardt, E., G. Stoll and M. Seewann 1982.
“Algorithm for the extraction of pitch and pitch
salience from complex tonal signals.” J. Acoust.
Soc. Am., 71(3), 679-687
• Terhardt, E., G. Stoll and M. Seewann 1982. “Pitch
of complex signals according to virtual-pitch
theory: Tests, examples and predictions.” J. Acoust.
Soc. Am., 71(3), 671
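The template-matching idea behind pattern recognition models can be sketched in a few lines. This is a deliberately simplified harmonic-fit search and is not the algorithm of Goldstein (1973) or of Terhardt et al. (1982); the candidate grid, the mismatch measure, the tie-breaking rule and the name estimate_f0 are all assumptions made for the example.

```python
def estimate_f0(components, candidates):
    """Pick the candidate fundamental whose harmonic series best fits the components."""
    def mismatch(f0):
        total = 0.0
        for f in components:
            n = max(1, round(f / f0))          # nearest harmonic number
            total += abs(f - n * f0) / f       # relative distance to that harmonic
        return total / len(components)
    # Smallest mismatch wins; among equal fits the highest f0 is preferred,
    # otherwise subharmonics (e.g. 100 Hz for a 200 Hz series) would fit equally well.
    return min(candidates, key=lambda f0: (round(mismatch(f0), 6), -f0))

candidates = range(50, 1001)                   # assumed search grid, 1 Hz steps
f0_missing = estimate_f0([400, 600, 800], candidates)
f0_present = estimate_f0([200, 400, 600, 800], candidates)
print(f0_missing, f0_present)
```

For the components 400, 600 and 800 Hz the best-fitting template has a 200 Hz fundamental, which matches the pitch of the missing fundamental described earlier.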
Model for pitch perception in
complex tones
• Some models of pitch perception combine
both place and temporal information
• Model proposed by Moore (2003) in earlier
editions of his book
Bank of bandpass filters → Neural transduction → Analysis of spike intervals → Combine intervals across CFs → Most common interval → Pitch
• Bank of bandpass filters: - spectral analysis on the
input sound
• filters organised according to their centre
frequencies, representing tonotopic organisation of
frequencies on the BM
• Each filter may be thought of as representing the
frequency response of one point on the BM.
• bandwidths according to the ERB
• Neural transduction – represents mechanical to
neural transduction at the hair cell auditory nerve
fibre synapse
• output stream of spike events precisely located in
time
• to represent the signal produced by the auditory
nerve fibres
• reflects the waveform structure produced at each
point on the BM
• Analysis of intervals – periodic output of
filter channel – find period / frequencies
present in each channel
• Compare time intervals across channels –
for a harmonic sound the lowest common
interval is that of the fundamental frequency
• Graph represents waveform at 4 points –
first 4 harmonics for a 200 Hz fundamental
[Figure: 200, 400, 600 and 800 Hz waveforms, the first 4 harmonics of a 200 Hz fundamental; four panels, amplitude from -0.5 to 0.5 against time (s) from 0 to 0.025]
Harmonic (Hz)   Period and its multiples (s)
200             0.005
400             0.0025, 0.005
600             0.00167, 0.0033, 0.005
800             0.00125, 0.0025, 0.00375, 0.005
• Intervals between successive firings for a complex
sound consisting of the frequency components
200, 400, 600 and 800 Hz.
• Intervals between successive nerve firings indicate
the period of each individual harmonic, when
sufficiently resolved.
• Lowest common time interval at 0.005 s –
corresponds to the 200 Hz fundamental frequency
• Computer implementation of the above
model:
• Meddis, R. and L. O’Mard. 1997. “A
unitary model of pitch perception.” J.
Acoust. Soc. Am., 102 (3), 1811-1820
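The "lowest common interval" step can be sketched directly from the 200, 400, 600 and 800 Hz example. This is only a toy version of the final stages of the model and is not the Meddis and O'Mard (1997) implementation: each channel is assumed to contain a single perfectly resolved harmonic, the intervals are idealised as exact multiples of its period, and the function names, the 0.02 s search limit and the tolerance are assumptions.

```python
def interval_multiples(freq, max_interval):
    """Idealised inter-spike intervals: integer multiples of the period up to max_interval."""
    period = 1.0 / freq
    out, k = [], 1
    while k * period <= max_interval + 1e-9:
        out.append(k * period)
        k += 1
    return out

def lowest_common_interval(freqs, max_interval=0.02, tol=1e-6):
    """Smallest interval that appears in every channel, or None if there is none."""
    channels = [interval_multiples(f, max_interval) for f in freqs]
    for candidate in channels[0]:              # already in ascending order
        if all(any(abs(candidate - iv) < tol for iv in ch) for ch in channels[1:]):
            return candidate
    return None

common_full = lowest_common_interval([200, 400, 600, 800])
common_missing = lowest_common_interval([400, 600, 800])
print(common_full, 1 / common_full, common_missing, 1 / common_missing)
```

Both cases give a lowest common interval of about 0.005 s, i.e. a 200 Hz pitch, with or without the 200 Hz component present.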
Pitch organisation in WTM
• The pitches of notes in Western tonal music (WTM) are most often tuned
according to the system of equal temperament.
• Equal tempered tuning was formed out of a
requirement for equally spaced intervals in terms
of frequency ratio regardless of tonality (the
musical key).
• Two notes are an octave apart if their frequencies
are in the ratio 2:1.
• In equal tempered tuning the octave is divided into
twelve equal logarithmic steps called semitones.
• The fundamental frequency of adjacent semitones
differs by a factor of $2^{1/12}$ (approximately 1.0595).
• The semitone may be subdivided into ‘cents’.
• There are 100 cents in a semitone and therefore
1200 cents in an octave.
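The semitone and cent relationships can be checked numerically. In the sketch below the 440 Hz reference is an assumption (the common A4 tuning standard, not stated in the lecture), and the function names are hypothetical.

```python
import math

SEMITONE = 2 ** (1 / 12)          # frequency ratio of one equal tempered semitone

def shift_semitones(freq, n):
    """Frequency n equal tempered semitones above (or below, if n < 0) freq."""
    return freq * SEMITONE ** n

def cents(f1, f2):
    """Size of the interval from f1 up to f2 in cents (1200 cents per octave)."""
    return 1200 * math.log2(f2 / f1)

ref = 440.0                        # assumed reference frequency in Hz
octave_up = shift_semitones(ref, 12)
one_semitone = cents(ref, shift_semitones(ref, 1))
print(octave_up, one_semitone, cents(ref, octave_up))
```

Twelve semitone steps double the frequency, one step measures 100 cents and the octave measures 1200 cents, up to floating-point rounding.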
• The ratios of all equal tempered musical intervals
(except for the unison and the octave) match only
approximately the ratios of the corresponding
pure intervals in the harmonic series; the
octave and the unison match exactly.
• Only the octave and unison intervals can be
described in terms of small integer ratios.
• All other intervals are tempered slightly
from the small integer ratio (e.g. the fifth is
tempered slightly less than pure, the fourth
is slightly greater than pure and the major
third is greater than pure).
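The direction and size of the tempering can be worked out in cents. The sketch below assumes the usual pure ratios of 3:2 for the fifth, 4:3 for the fourth and 5:4 for the major third, and their equal tempered sizes of 7, 5 and 4 semitones; these specific ratios are standard values rather than figures given in the lecture, and the function name is hypothetical.

```python
import math

# (name, pure ratio numerator, denominator, size in equal tempered semitones)
INTERVALS = [("fifth", 3, 2, 7), ("fourth", 4, 3, 5), ("major third", 5, 4, 4)]

def tempering_cents(num, den, semitones):
    """Equal tempered size minus pure size, in cents (positive = wider than pure)."""
    return 100 * semitones - 1200 * math.log2(num / den)

deviations = {name: tempering_cents(n, d, s) for name, n, d, s in INTERVALS}
for name, dev in deviations.items():
    print(f"{name}: {dev:+.2f} cents")
```

Under these assumptions the fifth comes out about 2 cents narrower than pure, the fourth about 2 cents wider and the major third roughly 14 cents wider, in line with the directions listed above.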
Pitch in two dimensions
• Pitch perception in music is often thought of in
two dimensions, pitch height and pitch chroma
(Shepard, 1964). (asa trk 52)
• This is to account for the perceived similarity of
pitches that are separated by octaves.
• Pitch height is the low / high dimension of pitch.
• The relative position of a pitch within a given
octave is referred to as its chroma.
• In Western music theory this is indicated by
the octave equivalent pitch classes, C, C#
etc.
• Music notes are identified first by their
position within the octave, their chroma,
and then by the octave in which they are
placed (e.g. G3, F#6).
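The two-dimensional description maps naturally onto a small data structure: a note name splits into a chroma (position within the octave) and an octave number, and the two together give pitch height. The sketch below assumes sharps-only spelling, single-digit octave numbers and a height measured in semitones above C0; these conventions and the function names are assumptions for illustration.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def split_note(name):
    """Split a name such as 'F#6' into (chroma index 0-11, octave number)."""
    chroma, octave = name[:-1], int(name[-1])
    return PITCH_CLASSES.index(chroma), octave

def height_in_semitones(name):
    """Pitch height as semitones above C0 (an assumed reference point)."""
    chroma_index, octave = split_note(name)
    return 12 * octave + chroma_index

def same_chroma(a, b):
    """True if two notes share a pitch class, i.e. differ only by whole octaves."""
    return split_note(a)[0] == split_note(b)[0]

print(split_note("G3"), split_note("F#6"))
print(same_chroma("C4", "C5"), height_in_semitones("C5") - height_in_semitones("C4"))
```

Notes such as C4 and C5 share a chroma but differ in height by 12 semitones, which is the octave similarity that the chroma dimension is meant to capture.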