Sup1 - University of Kentucky
Audio Scene Analysis and Music
Cognitive Elements of Music Listening
Kevin D. Donohue
Databeam Professor
Electrical and Computer Engineering
University of Kentucky
What is Music?
1 a : the science or art of ordering tones or
sounds in succession, in combination, and
in temporal relationships to produce a
composition having unity and continuity b :
vocal, instrumental, or mechanical sounds
having rhythm, melody, or harmony
Merriam-Webster Online Dictionary:
http://www.m-w.com/dictionary/music
Auditory Scene: Input
Sensory organs (ears) separate acoustic energy into
frequency bands and convert band energy into neural
firings
The auditory cortex receives the neural responses and
abstracts an auditory scene.
[Figure: time-frequency plot (axes: Time, Frequency)]
http://hyperphysics.phy-astr.gsu.edu/hbase/sound/hearcon.html
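The band-splitting step described on this slide can be sketched in a few lines. This is only an illustration, not a model of the ear: it assumes fixed-length frames, a naive DFT, and three arbitrary band edges (the names `band_energies`, `analyze`, and `EDGES` are hypothetical).

```python
import cmath
import math

def band_energies(frame, sample_rate, band_edges_hz):
    """Sum the DFT power of one frame into the bands given by band_edges_hz."""
    n = len(frame)
    energies = [0.0] * (len(band_edges_hz) - 1)
    for k in range(n // 2 + 1):  # non-negative frequencies only
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        for b in range(len(energies)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                energies[b] += abs(coeff) ** 2
                break
    return energies

def analyze(signal, sample_rate, frame_len, band_edges_hz):
    """Split the signal into non-overlapping frames; return band energies per frame."""
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    return [band_energies(signal[s:s + frame_len], sample_rate, band_edges_hz)
            for s in starts]

SAMPLE_RATE = 8000                      # assumed sampling rate in Hz
EDGES = [0, 500, 1500, 4001]            # assumed band edges in Hz
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(512)]
scene = analyze(tone, SAMPLE_RATE, 256, EDGES)
```

For the 1000 Hz test tone, nearly all of the energy in each frame falls in the middle band (500 to 1500 Hz), which is the kind of time-by-frequency description the later stages are assumed to work from.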
Auditory Scene: Perception
Perception derives a useful representation of reality
from sensory input.
An auditory stream is a perceptual unit associated
with a single happening (A.S. Bregman, 1990).
Acoustic to Neural Conversion -> Organize into Auditory Streams -> Representation of Reality
Auditory Stream Experiment
Bregman & Campbell (1971)
Streams tend to form by grouping notes close in time and
frequency (similarity and proximity).
http://www.psych.mcgill.ca/labs/auditory/demo3.html
http://www.psych.mcgill.ca/labs/auditory/demo2.html
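The grouping rule stated above (notes close in time and frequency fall into the same stream) can be illustrated with a toy greedy assignment. This is a sketch under stated assumptions, not the procedure used in the experiment: pitches are given as semitone numbers, and the thresholds `max_interval` and `max_gap` are arbitrary illustrative values.

```python
def group_into_streams(notes, max_interval=5.0, max_gap=0.5):
    """Greedily assign notes to streams by proximity in pitch and time.

    notes: list of (onset_seconds, pitch_semitones), sorted by onset.
    A note joins the stream whose most recent note is nearest in pitch,
    provided it is within max_interval semitones and max_gap seconds;
    otherwise it starts a new stream.
    """
    streams = []
    for onset, pitch in notes:
        best = None
        for stream in streams:
            last_onset, last_pitch = stream[-1]
            close_in_pitch = abs(pitch - last_pitch) <= max_interval
            close_in_time = (onset - last_onset) <= max_gap
            if close_in_pitch and close_in_time:
                if best is None or abs(pitch - last_pitch) < abs(pitch - best[-1][1]):
                    best = stream
        if best is None:
            streams.append([(onset, pitch)])
        else:
            best.append((onset, pitch))
    return streams

# Rapidly alternating high and low notes (hypothetical pitch values).
pitches = [72, 48, 74, 50, 76, 52]
notes = [(0.1 * i, p) for i, p in enumerate(pitches)]
streams = group_into_streams(notes)
stream_pitches = [[p for _, p in s] for s in streams]
# stream_pitches == [[72, 74, 76], [48, 50, 52]]
```

Under these assumed thresholds the interleaved sequence splits into a high stream and a low stream, even though the notes were presented in strict alternation.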
Circularity in Pitch Judgement
Shepard’s Scale (1964)
(Auditory Demonstrations CD, from the Acoustical Society of America)
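One way to see where the circularity comes from is to list the sinusoidal components of each tone in such a scale. The sketch below assumes octave-spaced components under a raised-cosine amplitude envelope over log frequency; the base frequency, the number of octaves, and the function name `shepard_components` are illustrative choices, not values taken from the demonstration CD.

```python
import math

def shepard_components(step, base_hz=27.5, n_octaves=8, steps_per_octave=12):
    """Return (frequency_hz, amplitude) pairs for the tone at a given scale step.

    Components are one octave apart. Amplitude follows a raised cosine over
    log frequency, so it fades to zero at both ends of the covered range.
    """
    components = []
    for octave in range(n_octaves):
        position = octave + step / steps_per_octave  # octaves above base_hz
        freq = base_hz * 2.0 ** position
        amp = 0.5 * (1.0 - math.cos(2.0 * math.pi * position / n_octaves))
        components.append((freq, amp))
    return components

def audible(components, floor=1e-9):
    """Keep only components with non-negligible amplitude, rounded for comparison."""
    return [(round(f, 6), round(a, 6)) for f, a in components if a > floor]

start = audible(shepard_components(0))
one_octave_up = audible(shepard_components(12))
```

After twelve steps every component has moved up an octave, yet the set of audible components is the same as at step 0: the top component has faded out and a new one has faded in at the bottom. That is the sense in which the scale can keep rising without getting anywhere.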
Perceptual Organization
Organization properties:
Belongingness – a sensory element belongs to an
organization (or stream) of which it is a part.
Exclusive allocation – a sensory element cannot
belong to more than one organization at a time.
Bregman & Rudnicky (1975)
Perceptual Organization
Organization properties:
Closure – perceived continuity, a tendency to
close strong perceptual forms, response to
missing evidence.
Sequential and Spectral Integration
Sequential Integration
Grouping sensory elements over time or events at
different times considered to be from the same source.
Melody, rhythm
Spectral Integration
Fusing simultaneous sensory elements over frequency
into one
Timbre, harmony
Timbre and Spectral Integration
The time envelope and harmonic structure give rise to the
timbre of the sound.
[Figure: example sounds shown as waveforms (Amplitude vs. Seconds) and spectra (dB vs. Hertz)]
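The two ingredients of timbre named on this slide, harmonic structure and time envelope, can be varied independently in a small additive-synthesis sketch. The harmonic weights and envelope shapes below are made-up illustrative values, not measurements of the instruments in the figure, and `synth_tone`, `plucked`, and `bowed` are hypothetical names.

```python
import math

def synth_tone(f0, harmonic_amps, envelope, duration=1.0, sample_rate=8000):
    """Sum harmonics of f0 weighted by harmonic_amps, shaped by envelope(t)."""
    n = int(duration * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        partials = sum(a * math.sin(2 * math.pi * f0 * (h + 1) * t)
                       for h, a in enumerate(harmonic_amps))
        samples.append(envelope(t) * partials)
    return samples

def plucked(t):
    """Assumed envelope: immediate attack, exponential decay."""
    return math.exp(-5.0 * t)

def bowed(t):
    """Assumed envelope: 0.2 s linear attack, then sustained."""
    return min(1.0, t / 0.2)

# Same fundamental (same pitch), different harmonic weights and envelopes.
pluck_like = synth_tone(220.0, [1.0, 0.5, 0.33, 0.25], plucked)
bow_like = synth_tone(220.0, [1.0, 0.1, 0.6, 0.05], bowed)
```

Both tones share a 220 Hz fundamental, so they would be heard at the same pitch, but their waveforms and spectra differ in the way the figure's waveform and spectrum panels suggest.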
Timbre and Spectral Integration
Simultaneous tones grouped by timbre
[Figure: two spectrograms (Hertz vs. Seconds), panels titled "2 Notes (F and A)" and "Same Note (A)"]
Auditory Scene Organization
Primitive Stream Segregation
Inherent constraints in auditory scene analysis (perceptual organization
demonstrated by infants/children)
Music: Organization of musical sensory units
Schema-based segregation
Learned constraints in auditory scene analysis (differences in perceptual
organization resulting from training and culture)
Music: Differences between musicians and non-musicians
Music: Differences resulting from acculturation
(A.S. Bregman, Auditory Scene Analysis, MIT Press 1990, pp. 1-45)
Music Related Terms
Pitch – Perceived frequency/fundamental tone (20 Hz to 20 kHz range)
Melody – Pattern of tones identified by the intervals
between consecutive pitches
Contour – Shape of the melody without regard to
intervals
Loudness – Perceived intensity of sound (0 dB to 120 dB)
Timbre – Nature of a sound defined mostly by its
harmonic structure and time envelope
Rhythm – Repeated pattern of strong and weak sounds
Tempo – Rate of the rhythm
Melody Invariance
A melody can typically be recognized over changes in
pitch, loudness, timbre, tempo, spatial location, and
reverberations.
Contours are typically recalled better than actual melodies
(intervals) for unfamiliar tunes. (Massaro, Kallman, and
Kelly 1980).
(Daniel J. Levitin, Memory for Musical Attributes, in Music
Cognition and Computerized Sound, ed. P.R. Cook, MIT Press, 1999,
pp. 209-227)
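The definitions of melody (intervals) and contour (shape), and the invariance of a melody under a change of pitch, can be made concrete with a short sketch. It assumes pitches are written as semitone numbers; the example tunes are invented for illustration.

```python
def intervals(pitches):
    """Signed semitone steps between consecutive pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def contour(pitches):
    """Direction of each step: +1 up, -1 down, 0 repeated."""
    return [(step > 0) - (step < 0) for step in intervals(pitches)]

tune = [60, 62, 64, 60]                  # hypothetical four-note tune
transposed = [p + 5 for p in tune]       # same tune, shifted up five semitones
same_shape = [60, 64, 65, 59]            # different intervals, same ups and downs

assert intervals(transposed) == intervals(tune)      # melody preserved
assert contour(same_shape) == contour(tune)          # contour preserved
assert intervals(same_shape) != intervals(tune)      # but not the melody
```

Transposition leaves the interval pattern untouched, which is one way to read the claim that a melody survives a change of pitch. The third tune shares only the contour, the coarser description that the slide says is typically recalled better for unfamiliar tunes.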
Primitive Musical Perception
Distinguish between cognitive components present
at an early age and those resulting from
acculturation.
Infant: Grasp of musical structures
Adult: Develop cognitive strategies for applying
musical structures
(W. Jay Dowling, The Development of Music Perception and Cognition,
in The Psychology of Music, Academic Press, 1999, pp. 603-625)
Summary
Innate organization for separating sounds from different sources.
Grouping by pitch, contour, rhythm (phrasing), and timbre is
exhibited by infants.
Acculturation refines melody distinctions and the relationship of melody to
harmonies and rhythms, based on cultural scales and patterns.
Melodic memory is enhanced for melodies that follow the notes of a
known scale.
Auditory scene analysis operations apply broadly to all sounds
(speech, noise, music). Why some auditory streams become
pleasurable/stimulating/interesting (music), and others are simply
used to form a perception of reality is still not clear.
How many streams are there?
[Figure: "Tell Me Ma - Spectrogram in dB" (axes: Hertz vs. Seconds)]
Interesting Websites
Mind, Music, and Machine
http://www.nici.kun.nl/mmm/
Auditory Scene Analysis
http://www.psych.mcgill.ca/labs/auditory/introASA.html
Joe Wolfe’s Web Page
http://www.phys.unsw.edu.au/~jw/Joe.html