unit1sup - University of Kentucky
Audio Scene Analysis and Music
Cognitive Elements of Music Listening
Kevin D. Donohue
Databeam Professor
Electrical and Computer Engineering
University of Kentucky
What is Music?
1 a : the science or art of ordering tones or
sounds in succession, in combination, and
in temporal relationships to produce a
composition having unity and continuity b :
vocal, instrumental, or mechanical sounds
having rhythm, melody, or harmony
Merriam-Webster Online Dictionary:
http://www.m-w.com/dictionary/music
Auditory Scene: Input
Sensory organs (ears) separate acoustic energy into
frequency bands and convert band energy into neural
firings
The auditory cortex receives the neural responses and
abstracts an auditory scene.
[Figure: plot with axes Time and Frequency]
http://hyperphysics.phy-astr.gsu.edu/hbase/sound/hearcon.html
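The band-splitting step described above can be illustrated with a rough sketch. It is not a model of the cochlea: it uses a plain discrete Fourier transform as a stand-in for the ear's filter bank, and the sample rate, test tone, and band edges are hypothetical values chosen only for illustration.

```python
import cmath
import math

def band_energies(signal, sample_rate, band_edges_hz):
    """Sum DFT power inside each [low, high) band (naive O(N^2) DFT)."""
    n = len(signal)
    # Power at each non-negative frequency bin k; bin frequency is k * sample_rate / n.
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        power.append(abs(coeff) ** 2)
    energies = []
    for low, high in band_edges_hz:
        in_band = [p for k, p in enumerate(power)
                   if low <= k * sample_rate / n < high]
        energies.append(sum(in_band))
    return energies

# Hypothetical test input: a 1000 Hz sinusoid sampled at 8000 Hz.
sample_rate = 8000
n = 256
tone = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(n)]

# Hypothetical band edges in Hz (assumed for illustration, not from the slides).
bands = [(0, 500), (500, 2000), (2000, 4000)]
energies = band_energies(tone, sample_rate, bands)
```

Nearly all of the energy lands in the middle band, which contains the 1000 Hz tone. A closer approximation of hearing would use overlapping, roughly logarithmically spaced bands.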
Auditory Scene: Perception
Perception derives a useful representation of reality
from sensory input.
An auditory stream is a perceptual unit associated
with a single happening (A.S. Bregman, 1990).
Acoustic to Neural Conversion -> Organize into Auditory Streams -> Representation of Reality
Auditory Stream Experiment
Bregman & Campbell (1971)
Streams tend to form by grouping notes close in time and frequency
(similarity and proximity).
Click on spectrograms to play tone sequence. Identify changes in tone
grouping based on separation in time and frequency.
http://www.psych.mcgill.ca/labs/auditory/demo3.html
http://www.psych.mcgill.ca/labs/auditory/demo2.html
Note change in grouping/phrasing
from inserting a pair of closely
spaced tones around the lower tone.
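As a toy illustration of grouping by proximity in time and frequency, the sketch below assigns each tone to the existing stream whose most recent tone is close enough in both dimensions. This is an assumed greedy heuristic, not the procedure or the stimuli of Bregman & Campbell; the function name, the thresholds `max_semitones` and `max_gap_s`, and the tone frequencies are all hypothetical.

```python
import math

def group_into_streams(notes, max_semitones=5.0, max_gap_s=0.5):
    """Greedy toy grouping; notes is a list of (onset_seconds, frequency_hz)."""
    streams = []
    for onset, freq in notes:
        best, best_dist = None, None
        for stream in streams:
            last_onset, last_freq = stream[-1]
            semitones = abs(12.0 * math.log2(freq / last_freq))
            gap = onset - last_onset
            # A tone may join a stream only if it is near in pitch AND in time.
            if semitones <= max_semitones and gap <= max_gap_s:
                if best_dist is None or semitones < best_dist:
                    best, best_dist = stream, semitones
        if best is None:
            streams.append([(onset, freq)])  # nothing close enough: new stream
        else:
            best.append((onset, freq))
    return streams

def alternate(first, second, spacing_s=0.1):
    """Interleave two tone lists, one tone every spacing_s seconds."""
    freqs = [f for pair in zip(first, second) for f in pair]
    return [(i * spacing_s, f) for i, f in enumerate(freqs)]

# Widely separated high and low tones split into two streams.
wide = group_into_streams(alternate([1600, 2000, 2500], [350, 430, 550]))

# Tones close in frequency stay together in a single stream.
narrow = group_into_streams(alternate([1000, 1000, 1000], [1100, 1100, 1100]))
```

With these assumed thresholds, `wide` contains two streams of three tones each and `narrow` contains one stream of six tones. The heuristic ignores effects such as presentation rate and attention, so it is only a starting point.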
Circularity in Pitch Judgement
Shepard’s Scale (1964)
(Auditory Demonstrations CD, from the Acoustical Society of America)
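The circularity can be sketched numerically. The code below builds octave-spaced components under a fixed bell-shaped amplitude envelope over log-frequency, then shows that stepping up twelve semitones reproduces the starting component set. The base frequency, number of octaves, and raised-cosine envelope are assumptions for illustration and do not reproduce Shepard's 1964 parameters.

```python
import math

def shepard_components(step, n_octaves=8, base_hz=27.5, steps_per_octave=12):
    """Return sorted (frequency_hz, amplitude) pairs for one Shepard tone."""
    components = []
    for k in range(n_octaves):
        # Position in octaves above base_hz, wrapped so the top re-enters at the bottom.
        position = (k + step / steps_per_octave) % n_octaves
        freq = base_hz * 2.0 ** position
        # Raised-cosine envelope: zero at both edges, maximal in the middle.
        amp = 0.5 * (1.0 - math.cos(2.0 * math.pi * position / n_octaves))
        components.append((freq, amp))
    return sorted(components)

def shepard_samples(step, duration_s=0.25, sample_rate=16000):
    """Sum the components into a list of audio samples."""
    comps = shepard_components(step)
    total = sum(a for _, a in comps)
    return [sum(a * math.sin(2.0 * math.pi * f * i / sample_rate) for f, a in comps) / total
            for i in range(int(duration_s * sample_rate))]

start = shepard_components(0)
one_step = shepard_components(1)       # one semitone up: every component shifts
after_octave = shepard_components(12)  # twelve semitones up: same set as the start
```

Because the component set repeats every twelve steps, a sequence of such tones can seem to rise without ever getting higher.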
Perceptual Organization
Organization properties:
Belongingness – a sensory element belongs to an
organization (or stream) of which it is a part.
Exclusive allocation – a sensory element cannot belong to
more than one organization at a time.
Bregman & Rudnicky (1975)
Click on spectrogram to listen to tone sequence. Note that in the
first case the later tonal group sounds like one stream due to
time proximity. In the second case, flanking the lower tones
with a sequence at the same frequency separates the lower tones
from the upper tones, creating 2 separate streams.
Perceptual Organization
Organization properties:
Closure – perceived continuity, a tendency to close strong
perceptual forms, response to missing evidence.
Click on time waveform plots to listen. In the first case a
low level tone is playing and then stops, but the gap is
covered by a white noise mask. Most will hear the tone
playing through the mask.
Tone pattern first spectrogram
White noise only, used in masking
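A stimulus of the kind described above can be sketched as follows: a low-level tone that stops during a gap, with the gap either left silent or filled with louder noise. The durations, levels, and tone frequency are assumed values, not those of the demonstration, and uniformly distributed random samples are used as a simple form of white noise.

```python
import math
import random

def tone_with_gap(fill_gap_with_noise, sample_rate=8000, duration_s=1.0,
                  gap=(0.4, 0.6), tone_hz=440.0, tone_amp=0.1,
                  noise_amp=0.8, seed=0):
    """Low-level tone that stops during `gap`; the gap is silent or noise-filled."""
    rng = random.Random(seed)  # fixed seed so the noise is reproducible
    samples = []
    for i in range(int(duration_s * sample_rate)):
        t = i / sample_rate
        if gap[0] <= t < gap[1]:
            if fill_gap_with_noise:
                samples.append(noise_amp * rng.uniform(-1.0, 1.0))
            else:
                samples.append(0.0)
        else:
            samples.append(tone_amp * math.sin(2.0 * math.pi * tone_hz * t))
    return samples

silent_gap = tone_with_gap(fill_gap_with_noise=False)
masked_gap = tone_with_gap(fill_gap_with_noise=True)
```

The two signals are identical outside the gap. The tone is absent from the gap in both, so any tone heard "through" the noise in the second case is supplied by the listener.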
Sequential and Spectral Integration
Sequential Integration
Grouping sensory elements over time, so that events at
different times are considered as coming from the same source.
Melody, rhythm
Spectral Integration
Fusing simultaneous sensory elements over frequency
into one perceptual unit
Timbre, harmony
Timbre and Spectral Integration
The harmonic structure (spectral envelope) and the time envelope give rise
to the timbre of the sound.
Click on spectra to hear sound. Note the impact of the spectral and time envelopes.
[Figure: spectra (dB versus Hertz, 0 to 12000 Hz) and time waveforms (Amplitude versus Seconds, 0 to 1 s) of the example sounds]
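To make the two ingredients of timbre concrete, the sketch below synthesizes two notes at the same pitch that differ only in their harmonic weights (spectral envelope) and amplitude-over-time curve (time envelope). The fundamental, harmonic weights, and envelope shapes are hypothetical and are not taken from the sounds on the slide.

```python
import math

def synthesize(f0_hz, harmonic_amps, envelope, duration_s=0.5, sample_rate=8000):
    """Sum harmonics of f0_hz weighted by harmonic_amps, shaped by envelope(t)."""
    samples = []
    for i in range(int(duration_s * sample_rate)):
        t = i / sample_rate
        # Spectral envelope: the h-th entry weights the harmonic at (h + 1) * f0_hz.
        s = sum(a * math.sin(2.0 * math.pi * f0_hz * (h + 1) * t)
                for h, a in enumerate(harmonic_amps))
        # Time envelope: scales the whole waveform as a function of time.
        samples.append(envelope(t) * s)
    return samples

def decaying(t):
    """Abrupt onset followed by exponential decay."""
    return math.exp(-8.0 * t)

def swelling(t):
    """Gradual linear attack over 0.2 s, then constant level."""
    return min(1.0, t / 0.2)

# Same pitch (220 Hz), different spectral and time envelopes.
bright_decaying = synthesize(220.0, [1.0, 0.5, 0.33, 0.25], decaying)
mellow_swelling = synthesize(220.0, [1.0, 0.1, 0.02], swelling)
```

Swapping only the envelope function, or only the harmonic weights, gives a way to explore how each factor contributes on its own.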
Timbre and Spectral Integration
Simultaneous tones grouped by timbre
Click on spectrograms to play sounds. Note that different spectral
bands do not sound like different streams. Just one stream is heard.
[Figure: two spectrograms (Hertz, 0 to 5000, versus Seconds, 0 to 0.4), titled "2 Notes (F and A)" and "Same Note (A)"]
Auditory Scene Organization
Primitive Stream Segregation
Inherent constraints in auditory scene analysis (perceptual organization
demonstrated by infants/children)
Music: Organization of musical sensory units
Schema-based segregation
Learned constraints in auditory scene analysis (differences in perceptual
organization resulting from training and culture)
Music: Differences between musicians and non-musicians
Music: Differences resulting from acculturation
(A.S. Bregman, Auditory Scene Analysis, MIT Press 1990, pp. 1-45)
Music Related Terms
Pitch – Perceived frequency/fundamental tone (20 Hz to 20 kHz range)
Melody – Pattern of tones identified by the intervals
between consecutive pitches
Contour – Shape of the melody without regard to
intervals
Loudness – Perceived intensity of sound (0 dB to 120 dB)
Timbre – Nature of a sound defined mostly by its
harmonic structure and time envelope
Rhythm – Repeated pattern of strong and weak sounds
Tempo – Rate of the rhythm
Melody Invariance
A melody can typically be recognized over changes in
pitch, loudness, timbre, tempo, spatial location, and
reverberations.
Contours are typically recalled better than actual melodies
(intervals) for unfamiliar tunes. (Massaro, Kallman, and
Kelly 1980).
(Daniel J. Levitin, Memory for Musical Attributes, in Music
Cognition and Computerized Sound, ed. P.R. Cook, MIT Press, 1999,
pp. 209-227)
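The definitions of melody (intervals) and contour, and the invariance of a melody under a change of pitch, can be sketched with a few lines of code. Pitches are written as MIDI note numbers, which is an assumption made for convenience; the example tunes are made up.

```python
def intervals(notes):
    """Signed semitone steps between consecutive pitches (the melody's intervals)."""
    return [b - a for a, b in zip(notes, notes[1:])]

def contour(notes):
    """Direction of each step only: +1 up, -1 down, 0 repeated note."""
    return [(step > 0) - (step < 0) for step in intervals(notes)]

# Hypothetical tune as MIDI note numbers (60 = middle C): C D E C.
tune = [60, 62, 64, 60]

# Transposing every note by the same amount changes pitch but not the intervals.
transposed = [n + 5 for n in tune]

# A different tune that keeps the same up/up/down shape but changes the step sizes.
same_shape = [60, 63, 67, 59]

same_melody = intervals(tune) == intervals(transposed)
same_contour = contour(tune) == contour(same_shape)
different_melody = intervals(tune) != intervals(same_shape)
```

Under this representation, the transposed tune counts as the same melody, while `same_shape` shares only the contour, the coarser description that listeners tend to retain for unfamiliar tunes.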
Primitive Musical Perception
Distinguish between cognitive components
present at an early age and those resulting from
acculturation.
Infant: Grasp of musical structures
Adult: Develop cognitive strategies for applying
musical structures
(W. Jay Dowling, The Development of Music Perception and Cognition,
in The Psychology of Music, Academic Press, 1999, pp. 603-625)
Summary
Innate perceptual organization separates sounds from different
sources. Grouping by pitch, contour, rhythm (phrasing), and timbre
is exhibited by infants.
Acculturation refines melody distinctions and their relationship to
harmonies and rhythms based on cultural scales and patterns.
Melodic memory is enhanced for melodies that follow the notes of a
known scale.
Auditory scene analysis operations apply broadly to all sounds
(speech, noise, music). Why some auditory streams become
pleasurable/stimulating/interesting (music), and others are simply
used to form a perception of reality is still not clear.
How many streams are there?
[Figure: "Tell Me Ma" spectrogram in dB; Hertz (0 to 8000) versus Seconds (0 to 15)]
Interesting Websites
Mind, Music, and Machine
http://www.nici.kun.nl/mmm/
Auditory Scene Analysis
http://www.psych.mcgill.ca/labs/auditory/introASA.html
Joe Wolfe’s Web Page
http://www.phys.unsw.edu.au/~jw/Joe.html