Sup1 - University of Kentucky

Audio Scene Analysis and Music
Cognitive Elements of Music Listening
Kevin D. Donohue
Databeam Professor
Electrical and Computer Engineering
University of Kentucky
What is Music?
1 a : the science or art of ordering tones or
sounds in succession, in combination, and
in temporal relationships to produce a
composition having unity and continuity b :
vocal, instrumental, or mechanical sounds
having rhythm, melody, or harmony
Merriam-Webster Online Dictionary:
http://www.m-w.com/dictionary/music
Auditory Scene: Input
 Sensory organs (ears) separate acoustic energy into
frequency bands and convert band energy into neural
firings
 The auditory cortex receives the neural responses and
abstracts an auditory scene.
[Figure: axes Time and Frequency. Source: http://hyperphysics.phy-astr.gsu.edu/hbase/sound/hearcon.html]
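The first step above, separating acoustic energy into frequency bands, can be illustrated with a minimal sketch. It is a crude stand-in and not a model of the cochlea: it uses a plain DFT, and the band edges, sample rate, and test tone are assumptions chosen for illustration only.

```python
import math

def band_energies(signal, sample_rate, band_edges):
    """Return the energy of `signal` in each [lo, hi) frequency band.

    Uses a plain O(N^2) DFT, which is adequate for short frames.
    """
    n = len(signal)
    energies = [0.0] * (len(band_edges) - 1)
    # A real-valued signal only needs the bins up to the Nyquist frequency.
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power = (re * re + im * im) / n
        for b in range(len(energies)):
            if band_edges[b] <= freq < band_edges[b + 1]:
                energies[b] += power
                break
    return energies

sample_rate = 8000                 # Hz (assumed)
n = 256                            # frame length (assumed)
# A 1000 Hz test tone; it falls exactly on DFT bin 32 (32 * 8000 / 256).
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n)]
edges = [0, 500, 1500, 4001]       # hypothetical band edges in Hz
energies = band_energies(tone, sample_rate, edges)
loudest_band = energies.index(max(energies))   # index of the 500-1500 Hz band
```

Nearly all of the energy lands in the middle band, which is the kind of per-band signal that would then be passed on as input to later stages.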
Auditory Scene: Perception
 Perception derives a useful representation of reality
from sensory input.
 An auditory stream is a perceptual unit associated
with a single happening (A.S. Bregman, 1990).
Acoustic to Neural Conversion -> Organize into Auditory Streams -> Representation of Reality
Auditory Stream Experiment
Bregman & Campbell (1971)
 Streams tend to form by grouping notes close in time and
frequency (similarity and proximity).
http://www.psych.mcgill.ca/labs/auditory/demo3.html
http://www.psych.mcgill.ca/labs/auditory/demo2.html
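Grouping by proximity in time and frequency can be sketched as a simple greedy assignment. This is an illustrative toy, not the procedure used in the experiment: the thresholds `max_semitones` and `max_gap`, and the note values, are assumptions.

```python
def group_into_streams(notes, max_semitones=5, max_gap=0.5):
    """Greedily assign (onset_seconds, midi_pitch) notes to streams.

    A note joins the stream whose most recent note is closest in pitch,
    provided it is also close enough in time; otherwise it starts a new stream.
    """
    streams = []
    for onset, pitch in sorted(notes):
        best = None
        for stream in streams:
            last_onset, last_pitch = stream[-1]
            close_in_time = onset - last_onset <= max_gap
            distance = abs(pitch - last_pitch)
            if close_in_time and distance <= max_semitones:
                if best is None or distance < abs(pitch - best[-1][1]):
                    best = stream
        if best is None:
            streams.append([(onset, pitch)])
        else:
            best.append((onset, pitch))
    return streams

# Rapidly alternating high and low notes, 0.1 s apart (hypothetical values).
pitches = [84, 60, 86, 62, 88, 64]
notes = [(0.1 * i, p) for i, p in enumerate(pitches)]
streams = group_into_streams(notes)
stream_pitches = [[p for _, p in s] for s in streams]
```

With these values the interleaved sequence splits into a high stream and a low stream, even though the notes arrive in strict alternation.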
Circularity in Pitch Judgement
 Shepard’s Scale (1964)
(Auditory Demonstrations CD, from the Acoustical Society of America)
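The circularity can be seen in how a Shepard tone is typically constructed: octave-spaced components under a fixed spectral envelope. The sketch below assumes a raised-cosine envelope over log-frequency, a 20 Hz lowest component, and 10 octaves; these values are illustrative and are not taken from the demonstration CD.

```python
import math

def shepard_components(step, f_min=20.0, n_octaves=10):
    """Return (frequency_hz, amplitude) pairs for one Shepard tone.

    `step` counts semitones up the scale. Only the pitch class (step mod 12)
    matters, which is what makes the scale circular.
    """
    pitch_class = step % 12
    components = []
    for octave in range(n_octaves):
        freq = f_min * 2 ** (octave + pitch_class / 12)
        # Position of this component within the fixed envelope, from 0 to 1.
        position = (octave + pitch_class / 12) / n_octaves
        # Raised cosine: silent at the edges, loudest in the middle.
        amp = 0.5 - 0.5 * math.cos(2 * math.pi * position)
        components.append((freq, amp))
    return components

start = shepard_components(0)
one_up = shepard_components(1)
full_circle = shepard_components(12)
```

Every semitone step raises each component by the same ratio, yet after twelve steps the set of components is identical to the starting set, so the scale can appear to rise indefinitely.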
Perceptual Organization
Organization properties:
 Belongingness – a sensory element belongs to an
organization (or stream) of which it is a part.
 Exclusive allocation – a sensory element cannot
belong to more than one organization at a time.
 Bregman & Rudnicky (1975)
Perceptual Organization
Organization properties:
 Closure – perceived continuity, a tendency to
close strong perceptual forms, response to
missing evidence.
Sequential and Spectral Integration
Sequential Integration
 Grouping sensory elements over time: events at
different times are considered to be from the same source.
Melody, rhythm
Spectral Integration
 Fusing simultaneous sensory elements over frequency
into one perceived sound
Timbre, harmony
Timbre and Spectral Integration
 The time envelope and harmonic structure give rise to the
timbre of the sound.
[Figure: waveform plots (Amplitude vs. Seconds) and spectrum plots (dB vs. Hertz) of example sounds]
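To make the two ingredients of timbre concrete, the sketch below synthesizes two tones with the same fundamental but different harmonic amplitudes and time envelopes. The envelope shapes, the names `plucked` and `bowed`, and all numeric values are assumptions for illustration; they are not the sounds shown in the figure.

```python
import math

def synthesize(f0, harmonic_amps, envelope, duration=1.0, sample_rate=8000):
    """Sum harmonics of f0 (harmonic structure) and shape them over time."""
    n = int(duration * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        s = sum(a * math.sin(2 * math.pi * f0 * (h + 1) * t)
                for h, a in enumerate(harmonic_amps))
        samples.append(envelope(t) * s)
    return samples

def plucked(t):
    """Fast attack followed by exponential decay (hypothetical envelope)."""
    return math.exp(-5.0 * t)

def bowed(t):
    """Slow 0.2 s attack, then sustained (hypothetical envelope)."""
    return min(t / 0.2, 1.0)

# Same pitch (220 Hz), different harmonic structure and envelope.
tone_a = synthesize(220.0, [1.0, 0.5, 0.25, 0.125], plucked)
tone_b = synthesize(220.0, [1.0, 0.0, 0.3, 0.0], bowed)
```

Both tones share a pitch, so any audible difference between them comes from the envelope and the harmonic amplitudes alone.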
Timbre and Spectral Integration
Simultaneous tones grouped by timbre
[Figure: two spectrograms (Hertz vs. Seconds), titled "2 Notes (F and A)" and "Same Note (A)"]
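One cue that could support grouping simultaneous tones is harmonic structure: partials that are near-integer multiples of a common fundamental are fused. The sketch below shows only that one cue and assumes the candidate fundamentals are already known, which a listener's auditory system is not given. The tolerance and the equal-tempered frequencies for F4 and A4 are illustrative choices.

```python
def group_partials(partials, fundamentals, tolerance=0.03):
    """Assign each partial to the fundamental it is closest to a harmonic of.

    Returns (groups, leftover): a dict mapping each fundamental to its
    partials, and a list of partials that fit no fundamental.
    """
    groups = {f0: [] for f0 in fundamentals}
    leftover = []
    for partial in partials:
        best_f0, best_error = None, tolerance
        for f0 in fundamentals:
            ratio = partial / f0
            harmonic = round(ratio)
            error = abs(ratio - harmonic)   # distance from an integer multiple
            if harmonic >= 1 and error <= best_error:
                best_f0, best_error = f0, error
        if best_f0 is None:
            leftover.append(partial)
        else:
            groups[best_f0].append(partial)
    return groups, leftover

F4, A4 = 349.23, 440.0   # Hz, equal temperament with A4 = 440 Hz
# Mixture of the first four harmonics of each note, as one sorted list.
mixture = sorted([F4 * h for h in range(1, 5)] + [A4 * h for h in range(1, 5)])
groups, leftover = group_partials(mixture, [F4, A4])
```

The mixed list of eight partials separates cleanly into four per note, with nothing left over.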
Auditory Scene Organization
 Primitive Stream Segregation
 Inherent constraints in auditory scene analysis (perceptual organization
demonstrated by infants/children)
 Music: Organization of musical sensory units
 Schema-based segregation
 Learned constraints in auditory scene analysis (differences in perceptual
organization resulting from training and culture)
 Music: Differences between musicians and non-musicians
 Music: Differences resulting from acculturation
(A.S. Bregman, Auditory Scene Analysis, MIT Press 1990, pp. 1-45)
Music Related Terms
 Pitch – Perceived frequency/fundamental tone (20 Hz to 20 kHz range)
 Melody – Pattern of tones identified by the intervals
between consecutive pitches
 Contour – Shape of the melody without regard to
intervals
 Loudness – Perceived intensity of sound (0 dB to 120 dB)
 Timbre – Nature of a sound defined mostly by its
harmonic structure and time envelope
 Rhythm – Repeated pattern of strong and weak sounds
 Tempo – Rate of the rhythm
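The decibel range quoted for loudness can be related to physical sound pressure. The sketch below assumes the range refers to sound pressure level (dB SPL) with the standard 20 micropascal reference; note that this is a physical measure, whereas loudness itself is a perceived quantity.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (20 micropascals)

def spl_db(pressure_pa):
    """Sound pressure level in dB for an RMS pressure given in pascals."""
    return 20 * math.log10(pressure_pa / P_REF)

def pressure_from_db(level_db):
    """Inverse of spl_db: RMS pressure in pascals for a level in dB."""
    return P_REF * 10 ** (level_db / 20)

quietest = spl_db(P_REF)              # the 0 dB end of the range
loudest = pressure_from_db(120.0)     # pressure at the 120 dB end, about 20 Pa
```

Under that assumption, the 120 dB span corresponds to a millionfold range in sound pressure.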
Melody Invariance
 A melody can typically be recognized over changes in
pitch, loudness, timbre, tempo, spatial location, and
reverberations.
 Contours are typically recalled better than actual melodies
(intervals) for unfamiliar tunes. (Massaro, Kallman, and
Kelly 1980).
(Daniel J. Levitin, Memory for Musical Attributes, in Music
Cognition and Computerized Sound, ed. P.R. Cook, MIT Press, 1999,
pp. 209-227)
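The distinction between a melody's intervals and its contour can be made concrete with a short sketch. Pitches are written as MIDI note numbers, and the example melodies are hypothetical.

```python
def intervals(pitches):
    """Semitone steps between consecutive pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def contour(pitches):
    """Direction of each step: +1 up, -1 down, 0 repeated note."""
    return [(d > 0) - (d < 0) for d in intervals(pitches)]

melody = [60, 62, 64, 60]                 # hypothetical tune
transposed = [p + 5 for p in melody]      # same tune, shifted in pitch
variant = [60, 63, 67, 58]                # same shape, different step sizes

same_melody = intervals(transposed) == intervals(melody)
same_contour = contour(variant) == contour(melody)
same_intervals = intervals(variant) == intervals(melody)
```

Transposition leaves the intervals unchanged, which is consistent with recognition surviving a change in pitch. The variant keeps only the up/up/down shape, the coarser description that the cited study found is recalled better for unfamiliar tunes.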
Primitive Musical Perception
 Distinguish between cognitive components present
at an early age and those resulting from
acculturation.
 Infant: Grasp of musical structures
 Adult: Develop cognitive strategies for applying
musical structures
(W. Jay Dowling, The Development of Music Perception and Cognition,
in The Psychology of Music, Academic Press, 1999, pp. 603-625)
Summary
 Innate organization separates sounds from different sources.
Infants exhibit grouping by pitch, contour, rhythm (phrasing), and
timbre.
 Acculturation refines melody distinctions and the relationship of
melody to harmonies and rhythms, based on cultural scales and patterns.
 Melodic memory is enhanced for melodies that follow the notes of a
known scale.
 Auditory scene analysis operations apply broadly to all sounds
(speech, noise, music). Why some auditory streams become
pleasurable/stimulating/interesting (music), and others are simply
used to form a perception of reality is still not clear.
How many streams are there?
[Figure: Tell Me Ma - Spectrogram in dB (Hertz vs. Seconds)]
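A spectrogram in dB like the one on this slide can be computed with a short-time Fourier transform. The sketch below is a minimal pure-Python version; the Hann window, frame length, hop size, and test signal are assumptions and are not the settings used for the slide.

```python
import math

def spectrogram_db(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of bin powers in dB."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]          # Hann window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        column = []
        for k in range(frame_len // 2 + 1):       # bins up to Nyquist
            re = sum(x * math.cos(2 * math.pi * k * i / frame_len)
                     for i, x in enumerate(frame))
            im = sum(x * math.sin(2 * math.pi * k * i / frame_len)
                     for i, x in enumerate(frame))
            power = re * re + im * im
            column.append(10 * math.log10(max(power, 1e-12)))  # floor avoids log(0)
        frames.append(column)
    return frames

sample_rate = 8000  # Hz (assumed)
# Test signal: 1000 Hz for 256 samples, then 2000 Hz for 256 samples.
signal = [math.sin(2 * math.pi * (1000 if t < 256 else 2000) * t / sample_rate)
          for t in range(512)]
spec = spectrogram_db(signal)
first_peak = spec[0].index(max(spec[0]))     # strongest bin in the first frame
last_peak = spec[-1].index(max(spec[-1]))    # strongest bin in the last frame
bin_hz = sample_rate / 64                    # 125 Hz per bin
```

The strongest bin moves from 1000 Hz to 2000 Hz between the first and last frames. The spectrogram alone does not say how many streams are present; that grouping is the perceptual step the slide asks about.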
Interesting Websites
 Mind, Music, and Machine
http://www.nici.kun.nl/mmm/
 Auditory Scene Analysis
http://www.psych.mcgill.ca/labs/auditory/introASA.html
 Joe Wolfe’s Web Page
http://www.phys.unsw.edu.au/~jw/Joe.html