Machine Models of Human Musical Behaviour

Modelling the perception and cognition of musical structure
David Meredith
Centre for Cognition, Computation and Culture
Goldsmiths College, University of London
Algorithmic models of music cognition
[Diagram with regions labelled Theory and Real world]
Theory: input representation (e.g., MIDI, piano roll, WAV file); algorithmic model (formal rules, computer program); structural description (e.g., harmonic analysis, metrical structure, grouping structure); auxiliary hypotheses
Real world: "real-world" manifestation of music (e.g., sound, printed score, dance); sense organs (ears, eyes); neural encoding; brain; percept, interpretation, mental representation; musical behaviour (e.g., dancing, expressive performance, composition, improvisation)
Arrow labels: represented by (three arrows), predicts, causes
Longuet-Higgins’ model
metrical strength
Longuet-Higgins, H. C. (1976). The perception of melodies. Nature, 263(5579), 646-653.
Longuet-Higgins, H. C. (1987). The perception of melodies. In H. C. Longuet-Higgins (ed.), Mental Processes: Studies
in Cognitive Science, pp. 105-129. British Psychological Society/MIT Press, London/Cambridge, MA.
A flat, not G sharp
OUTPUT:
[[[24 C STC] [[-5 G STC] [0 G STC]]] [[1 AB] [-1 G TEN]]] [[[REST] [4 B STC]] [1 C TEN]]
Longuet-Higgins’ model of rhythm
Assumes that the listener starts by assuming a purely binary metre
But the listener is willing to change this assumption at any metrical level
Evidence for a change of metre:
  Current metre implies syncopation
    No note onset at the beginning of the next higher metrical unit
  Current metre implies an excessively large change in tempo
Metre is changed if
  there is evidence for a change, and
  the other division does not imply syncopation or an excessive tempo change
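The decision rule above can be sketched as a small predicate. This is only an illustration of the logic on the slide, not Longuet-Higgins' program: the class `MetricalReading` and the function `should_change_metre` are hypothetical names, and how syncopation or an excessive tempo change is actually detected is left out.

```python
from dataclasses import dataclass


@dataclass
class MetricalReading:
    """What one way of dividing a metrical unit implies for the notes heard so far."""
    implies_syncopation: bool
    implies_excessive_tempo_change: bool

    def is_implausible(self) -> bool:
        # Either kind of evidence counts against this reading.
        return self.implies_syncopation or self.implies_excessive_tempo_change


def should_change_metre(current: MetricalReading, alternative: MetricalReading) -> bool:
    """Change only if there is evidence against the current division
    and no such evidence against the other division."""
    return current.is_implausible() and not alternative.is_implausible()


binary = MetricalReading(implies_syncopation=True, implies_excessive_tempo_change=False)
ternary = MetricalReading(implies_syncopation=False, implies_excessive_tempo_change=False)
print(should_change_metre(binary, ternary))  # True
```

If both divisions are implausible, the sketch keeps the current metre, which is how the slide's "and" condition reads.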
Longuet-Higgins’ model of tonality
For each note, estimates a value of sharpness: the position of the pitch name on the line of fifths
Theory of tonality consists of six rules
  The first ensures that each note is spelt so that its name is as close as possible to the local tonic on the line of fifths
  The other rules control how the algorithm deals with chromatic intervals and modulations
    e.g., the second rule states that if the current key implies two consecutive chromatic intervals, then the key should be changed so that both become diatonic
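The first rule can be illustrated with a short sketch. It assumes a numbering of the line of fifths in which C = 0, G = 1, F = -1 and so on, and it breaks the tritone tie towards the sharp side; both choices are assumptions made for this example, as are the names `name_at` and `spell`. The other five rules are not modelled.

```python
LETTERS = "FCGDAEB"  # natural pitch names at line-of-fifths positions -1..5


def name_at(position):
    """Pitch name at a line-of-fifths position (C = 0, G = 1, F = -1, ...)."""
    accidentals, letter_index = divmod(position + 1, 7)
    mark = "#" * accidentals if accidentals >= 0 else "b" * -accidentals
    return LETTERS[letter_index] + mark


def spell(pitch_class, tonic_position):
    """Spell a pitch class (0 = C, ..., 11 = B) as close as possible to the tonic."""
    # Positions k with (7 * k) % 12 == pitch_class are k0, k0 + 12, k0 - 12, ...
    k0 = (7 * pitch_class) % 12
    # Pick the one lying between 5 steps flatward and 6 steps sharpward of the tonic.
    offset = (k0 - tonic_position + 5) % 12 - 5
    return name_at(tonic_position + offset)


print(spell(8, 0))  # tonic C: Ab
print(spell(8, 3))  # tonic A: G#
```

With C as the local tonic, pitch class 8 comes out as A flat, not G sharp; with A as the tonic the same pitch class is spelt G sharp.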
Longuet-Higgins’ model: Output
Section of cor anglais solo from Act III of Wagner’s Tristan und Isolde
  Change from binary to ternary in first beat of fifth bar (triplets)
  Grace note correctly identified in seventh bar
  Agrees fully with original score in tonal and rhythmic indications
    Wagner marked all triplets as staccato; fault with performance, not program!
98.21% of notes spelt correctly (3508 errors) in a 195972-note corpus of classical and baroque music
  Cf. 99.44% spelt correctly (1100 errors) by Meredith’s PS13s1 algorithm
Meredith, D. (2006). The ps13 pitch spelling algorithm. Journal of New Music Research, 35(2), pp. 121-159.
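The two percentages follow directly from the error counts and the corpus size. A minimal check, in which the helper name `percent_correct` is made up for this example:

```python
def percent_correct(errors, notes):
    """Percentage of notes spelt correctly, rounded to two decimal places."""
    return round(100 * (1 - errors / notes), 2)


CORPUS_SIZE = 195972
print(percent_correct(3508, CORPUS_SIZE))  # 98.21
print(percent_correct(1100, CORPUS_SIZE))  # 99.44
```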
Lerdahl and Jackendoff’s Generative Theory of Tonal Music (GTTM)
Lerdahl, F. and Jackendoff, R. (1983). A Generative Theory of Tonal Music. MIT Press, Cambridge, MA.
[Diagram. Input: musical surface. Rule components: grouping structure rules, metrical structure rules, time-span reduction rules, prolongational reduction rules. Outputs: grouping structure, metrical structure, time-span reduction, prolongational reduction.]
WELL-FORMEDNESS RULES define CLASS of POSSIBLE structural descriptions
PREFERENCE RULES used to find BEST structural descriptions
Lerdahl and Jackendoff’s theory of grouping structure
Listener automatically segments music into structural units of various sizes called groups
Grouping structure of a passage is the way that it is perceived to be segmented into groups
“Grouping can be viewed as the most basic component of musical understanding” (Lerdahl and Jackendoff, 1983, p.13)
Lerdahl and Jackendoff’s grouping well-formedness rules
GWFR 1 Any contiguous sequence of pitch-events, drum beats, or the like can constitute a group, and only contiguous sequences can constitute a group.
GWFR 2 A piece constitutes a group.
GWFR 3 A group may contain smaller groups.
GWFR 4 If a group G1 contains part of a group G2, then it must contain all of G2.
GWFR 5 If a group G1 contains a smaller group G2 then G1 must be exhaustively partitioned into smaller groups.
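The five rules can be checked mechanically if each group is written as a half-open range `(start, end)` of event indices. That representation is an assumption of this sketch (it builds GWFR 1 in, because a range is always contiguous), and the function name `is_well_formed` is hypothetical. Only the five rules as quoted are checked.

```python
def is_well_formed(groups, n_events):
    """Check a set of (start, end) groups over events 0..n_events-1 against GWFRs 1-5."""
    groups = set(groups)
    # GWFR 1: every group must be a non-empty range of events inside the piece.
    if any(not 0 <= a < b <= n_events for a, b in groups):
        return False
    # GWFR 2: the piece as a whole is a group.
    if (0, n_events) not in groups:
        return False
    # GWFR 3 (nesting is allowed) and GWFR 4 (partial overlap is not).
    for a, b in groups:
        for c, d in groups:
            overlap = max(a, c) < min(b, d)
            nested = (a <= c and d <= b) or (c <= a and b <= d)
            if overlap and not nested:
                return False
    # GWFR 5: a group that contains smaller groups must be fully covered by them.
    for a, b in groups:
        inner = [(c, d) for c, d in groups if a <= c and d <= b and (c, d) != (a, b)]
        if inner:
            covered = set()
            for c, d in inner:
                covered.update(range(c, d))
            if covered != set(range(a, b)):
                return False
    return True


print(is_well_formed([(0, 8), (0, 4), (4, 8), (0, 2), (2, 4)], 8))  # True
print(is_well_formed([(0, 8), (0, 5), (3, 8)], 8))                  # False (GWFR 4)
```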
The Gestalt principles of proximity and similarity in vision and in music
Lerdahl and Jackendoff’s second grouping preference rule
GPR 2 (Proximity) Consider a sequence of four notes n1, n2, n3, n4. All else being equal, the transition n2–n3 may be heard as a group boundary if
a. (Slur/Rest) the interval of time from the end of n2 to the beginning of n3 is greater than that from the end of n1 to the beginning of n2 and that from the end of n3 to the beginning of n4, or if
b. (Attack-Point) the interval of time between the attack points of n2 and n3 is greater than that between the attack points of n1 and n2 and that between the attack points of n3 and n4.
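Both cases of GPR 2 compare one time interval with its two neighbours, so they can be sketched as a sliding window over four notes. The note encoding as `(onset, offset)` pairs and the function name `gpr2_boundaries` are assumptions of this example; the "all else being equal" clause and the interaction with other rules are not modelled.

```python
def gpr2_boundaries(notes):
    """notes: list of (onset, offset) pairs in time order.

    Returns (i, cases) pairs: a boundary may be heard between notes[i] and
    notes[i + 1], and cases names the GPR 2 cases that apply there.
    """
    found = []
    for i in range(1, len(notes) - 2):
        n1, n2, n3, n4 = notes[i - 1:i + 3]
        cases = []
        # (a) Slur/Rest: gap from the end of one note to the start of the next.
        gap = n3[0] - n2[1]
        if gap > n2[0] - n1[1] and gap > n4[0] - n3[1]:
            cases.append("slur/rest")
        # (b) Attack-Point: distance between successive onsets.
        ioi = n3[0] - n2[0]
        if ioi > n2[0] - n1[0] and ioi > n4[0] - n3[0]:
            cases.append("attack-point")
        if cases:
            found.append((i, cases))
    return found


# Three notes, a one-beat rest, then three more notes.
with_rest = [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7)]
print(gpr2_boundaries(with_rest))  # [(2, ['slur/rest', 'attack-point'])]
```

A long note with no rest after it triggers only case (b), since the gaps between note ends and beginnings are all zero.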
Lerdahl and Jackendoff’s third grouping preference rule
GPR 3 (Change) Consider a sequence of four notes n1, n2, n3, n4. All else being equal, the transition n2–n3 may be heard as a group boundary if
a. (Register) the transition n2–n3 involves a greater intervallic distance than both n1–n2 and n3–n4, or if
b. (Dynamics) the transition n2–n3 involves a change in dynamics and n1–n2 and n3–n4 do not, or if
c. (Articulation) the transition n2–n3 involves a change in articulation and n1–n2 and n3–n4 do not, or if
d. (Length) n2 and n3 are of different lengths and both pairs n1, n2 and n3, n4 do not differ in length.
(One might add further cases to deal with such things as change in timbre or instrumentation.)
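The four cases of GPR 3 can be tested on a single window of four notes. In this sketch the `Note` fields (MIDI note number, dynamic marking, articulation marking, length in beats) and the function name `gpr3_cases` are assumed for illustration; the rule itself does not prescribe an encoding.

```python
from typing import NamedTuple


class Note(NamedTuple):
    pitch: int         # MIDI note number (an assumed encoding)
    dynamic: str       # e.g. "p", "f"
    articulation: str  # e.g. "legato", "staccato"
    length: float      # notated length in beats


def gpr3_cases(n1, n2, n3, n4):
    """Names of the GPR 3 cases under which n2-n3 may be heard as a boundary."""
    cases = []
    # (a) Register: the middle interval is larger than both neighbours.
    leap = abs(n3.pitch - n2.pitch)
    if leap > abs(n2.pitch - n1.pitch) and leap > abs(n4.pitch - n3.pitch):
        cases.append("register")
    # (b)-(d): a change at n2-n3 with no change at n1-n2 or n3-n4.
    for field, label in (("dynamic", "dynamics"),
                         ("articulation", "articulation"),
                         ("length", "length")):
        a, b, c, d = (getattr(n, field) for n in (n1, n2, n3, n4))
        if b != c and a == b and c == d:
            cases.append(label)
    return cases


window = [Note(60, "p", "legato", 1.0), Note(62, "p", "legato", 1.0),
          Note(72, "f", "legato", 2.0), Note(74, "f", "legato", 2.0)]
print(gpr3_cases(*window))  # ['register', 'dynamics', 'length']
```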
Temperley and Sleator’s Melisma music analyser
Temperley, D. (2001). The Cognition of Basic Musical Structures. MIT Press, Cambridge, MA.
Meredith, D. (2002). Review of David Temperley’s The Cognition of Basic Musical Structures (Cambridge, MA: MIT Press, 2001). Musicae Scientiae, 6(2), pp. 287-302.
[Diagram of data flow between the Melisma programs. Program labels: Meter (prechord mode), Harmony (prechord mode), Meter, Harmony, Key, Streamer, Grouper. Data labels: Notes, Beats, Beats (tactus and below), Chord change time points, TPCNotes, Chords, Notes with streams, Phrases, Roman numeral harmonic analysis.]
Temperley’s theory of contrapuntal structure: Input representation
Temperley’s contrapuntal well-formedness rules (CWFRs)
CWFR 1 A stream must consist of a set of temporally contiguous squares on the plane.
CWFR 2 A stream may be only one square wide in the pitch dimension.
CWFR 3 Streams may not cross in pitch.
CWFR 4 Each note must be entirely included in a single stream.
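A sketch of how these rules might be checked on a piano-roll grid. It assumes each stream is a dict from time index to pitch (which builds in CWFR 2, since a dict holds one pitch per time step) and each note is a `(start, end, pitch)` triple occupying time steps `start` to `end - 1`. It also reads "cross" as one stream being above another at one time and below it at another. All of this, and the name `streams_well_formed`, are assumptions of the example and not Temperley's implementation.

```python
def streams_well_formed(streams, notes):
    """streams: list of {time_index: pitch}; notes: list of (start, end, pitch)."""
    # CWFR 1: each stream occupies an unbroken run of time steps.
    for stream in streams:
        times = sorted(stream)
        if not times or times != list(range(times[0], times[-1] + 1)):
            return False
    # CWFR 3: no pair of streams may swap vertical order.
    for i, upper in enumerate(streams):
        for lower in streams[i + 1:]:
            diffs = [upper[t] - lower[t] for t in upper.keys() & lower.keys()]
            if any(d > 0 for d in diffs) and any(d < 0 for d in diffs):
                return False
    # CWFR 4: every square of a note must lie in one and the same stream.
    for start, end, pitch in notes:
        if not any(all(s.get(t) == pitch for t in range(start, end)) for s in streams):
            return False
    return True


soprano = {0: 72, 1: 72, 2: 74, 3: 74}
bass = {0: 60, 1: 60, 2: 59, 3: 59}
two_part = [(0, 2, 72), (2, 4, 74), (0, 2, 60), (2, 4, 59)]
print(streams_well_formed([soprano, bass], two_part))  # True
```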
Temperley’s contrapuntal preference rules (CPRs)
CPR 1 (Pitch Proximity Rule) Prefer to avoid large leaps within streams.
CPR 2 (New Stream Rule) Prefer to minimize the number of streams.
CPR 3 (White Square Rule) Prefer to minimize the number of white squares in streams.
CPR 4 (Collision Rule) Prefer to avoid cases where a single square is included in more than one stream.
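One way to see how the four preferences trade off is to turn each into a penalty and add them up, so that the preferred analysis is the one with the lowest total. The weights below are invented for illustration and are not Temperley's values; the stream encoding (dict from time index to pitch) and the name `penalty` are likewise assumptions.

```python
from collections import Counter


def penalty(streams, black_squares, weights=(1.0, 20.0, 5.0, 10.0)):
    """Total penalty of a set of streams; lower is better.

    streams: list of {time_index: pitch}
    black_squares: set of (time_index, pitch) squares where a note sounds
    weights: hypothetical costs per semitone leapt, per stream,
             per white square and per collision
    """
    w_leap, w_stream, w_white, w_collision = weights
    # CPR 1: size of pitch movement between adjacent squares of a stream.
    leaps = sum(abs(s[t + 1] - s[t]) for s in streams for t in s if t + 1 in s)
    # How many streams use each square.
    usage = Counter((t, p) for s in streams for t, p in s.items())
    # CPR 3: stream squares with no note in them.
    white = sum(n for square, n in usage.items() if square not in black_squares)
    # CPR 4: squares shared between streams.
    collisions = sum(n - 1 for n in usage.values() if n > 1)
    # CPR 2 is the per-stream cost.
    return (w_leap * leaps + w_stream * len(streams)
            + w_white * white + w_collision * collisions)


# Notes alternating between a high and a low register.
black = {(0, 72), (1, 60), (2, 72), (3, 60)}
one_stream = [{0: 72, 1: 60, 2: 72, 3: 60}]
two_streams = [{0: 72, 1: 72, 2: 72}, {1: 60, 2: 60, 3: 60}]
print(penalty(one_stream, black))   # 56.0
print(penalty(two_streams, black))  # 50.0
```

With these weights the two-stream reading wins: three octave leaps cost more than an extra stream plus two white squares. Different weights would shift that balance.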
Using Temperley’s theory to model listening, composition, performance and style
Temperley and Sleator’s programs scan the music from left to right, keeping note of the analyses that best satisfy the preference rules so far at each point.
Ambiguity: Two or more best analyses at a given point in the music.
Revision: The best analysis at some point in the music does not form part of the best analysis at some later point.
Expectation: The most expected events are those that will lead to an analysis that best satisfies the preference rules.
Style: A passage is in the style defined by a set of preference rules if the analysis that best satisfies the preference rules achieves a score that is not too high (boring) and not too low (incomprehensible).
Composition: Choices guided by goal to produce piece that satisfies preference rules to just the right extent.
Performance: Temporal and dynamic expression geared towards conveying structure in accordance with analysis that best satisfies the preference rules.
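The left-to-right scan and the notion of revision can be illustrated with a generic dynamic-programming sketch. It is not Temperley and Sleator's code: the functions `scan` and `revisions`, the scoring table `FIT` and the key-change cost of -4 are all invented for this example, which labels each note with a key.

```python
def scan(events, labels, local, transition):
    """For each point in the music, return the best-scoring analysis so far.

    local(event, label) scores one event under one label;
    transition(previous_label, label) scores moving between labels.
    """
    # best[label] = (score, path) of the best analysis so far ending in label
    best = {lab: (local(events[0], lab), [lab]) for lab in labels}
    history = [max(best.values(), key=lambda sp: sp[0])[1]]
    for event in events[1:]:
        updated = {}
        for lab in labels:
            score, path = max(
                ((s + transition(prev, lab), p) for prev, (s, p) in best.items()),
                key=lambda sp: sp[0],
            )
            updated[lab] = (score + local(event, lab), path + [lab])
        best = updated
        history.append(max(best.values(), key=lambda sp: sp[0])[1])
    return history


def revisions(history):
    """Points where the earlier best analysis is not part of the new best one."""
    return [t for t in range(1, len(history)) if history[t][:-1] != history[t - 1]]


# Hypothetical fit of each note name to each key (higher is better).
FIT = {("F", "C major"): 1, ("F", "G major"): -2,
       ("G", "C major"): 1, ("G", "G major"): 1,
       ("F#", "C major"): -3, ("F#", "G major"): 1}

history = scan(["F", "G", "F#", "F#"], ["C major", "G major"],
               local=lambda note, key: FIT[(note, key)],
               transition=lambda a, b: 0 if a == b else -4)
print(history[1])          # ['C major', 'C major']
print(history[2])          # ['G major', 'G major', 'G major']
print(revisions(history))  # [2]
```

At the third note the first F sharp makes a G major reading of the whole passage score higher than the C major reading, so the earlier analysis is revised. Ambiguity would show up as a tie between the scores in `best`, which this sketch resolves arbitrarily.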
Summing up
We can attempt to model the perception and cognition of musical structure by constructing algorithms that take representations of musical passages as input and generate structural descriptions of those passages as output
We can evaluate such algorithms by comparing their output with expert human analyses and authoritative scores
We can express a theory of musical structure as a preference rule system consisting of
  Well-formedness rules, which define the class of legal structural descriptions
  Preference rules: the legal structural descriptions that best satisfy them are predicted to be the ones that listeners are most likely to hear
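The two-stage shape of a preference rule system can be written down generically: filter candidates by well-formedness, then keep the highest-scoring ones. The function name `best_descriptions` and the toy rules in the usage example are made up for illustration and are not taken from any of the theories above.

```python
def best_descriptions(candidates, is_well_formed, preference_score):
    """Return the legal candidates that best satisfy the preference rules."""
    legal = [c for c in candidates if is_well_formed(c)]
    if not legal:
        return []
    top = max(preference_score(c) for c in legal)
    # More than one result means the passage is ambiguous under these rules.
    return [c for c in legal if preference_score(c) == top]


# Toy example: segment four notes into groups, written as tuples of group sizes.
candidates = [(4,), (2, 2), (1, 3), (3, 2)]
legal = lambda sizes: sum(sizes) == 4                      # must cover all four notes
score = lambda sizes: -(max(sizes) - min(sizes)) - abs(len(sizes) - 2)
print(best_descriptions(candidates, legal, score))  # [(2, 2)]
```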