Music Representation - Indiana University Computer Science

Representation of Musical
Information
Donald Byrd
School of Informatics & Jacobs School of Music
Indiana University
Updated 17 Oct. 2007
Copyright © 2003-07, Donald Byrd
1
Review: Classification: Logician General’s Warning
• Classification is dangerous to your understanding
– Almost everything in the real world is messy, ill-defined
– Absolute correlations between characteristics are rare
– Example: some mammals lay eggs; some are “naked”
– Example: was the first real piano Cristofori's (ca. 1700), Broadwood's (ca. 1790), or another?
• People say “an X has characteristics A, B, C…”
• Usually mean “an X has A, & usually B, C…”
• Leads to:
– People who know better claiming absolute correlations
– “Is it this or that or that?” questions that don’t have an answer
– Don changing his mind
• But lack of classification is also dangerous to your
understanding!
3 Oct. 07
2
Basic Representations of Music & Audio (1)
• Audio (e.g., CD, MP3): like speech
• Time-stamped Events (e.g., MIDI file): like unformatted text
• Music Notation: like text with complex formatting
27 Jan.
3
Basic Representations of Music & Audio (2)
| | Audio | Time-stamped Events | Music Notation |
| Common examples | CD, MP3 file | Standard MIDI File | Sheet music |
| Unit | Sample | Event | Note, clef, lyric, etc. |
| Explicit structure | none | little (partial voicing information) | much (complete voicing information) |
| Avg. rel. storage | 2000 | 1 | 10 |
| Convert to left | - | easy | OK job: easy |
| Convert to right | 1 note: pretty easy; other: hard or very hard | OK job: fairly hard | - |
| Ideal for | music, bird/animal sounds, sound effects, speech | music | music |
27 Jan.
4
Basic Representations of Music & Audio (3)
Audio: no (explicit) structure
Events/MIDI: simple structure
Notation: very complex structure
2 Oct. 07
5
Dimensions of Music Representations &
Encodings (1)
[Figure: encodings plotted along two axes, Expressive Completeness and Structural Generality. Points shown: Waveform, Csound, CMN, MusicXML, Notelist, MIDI (SMF).]
(After Wiggins et al. (1993). A Framework for the Evaluation of Music Representation Systems.)
rev. 20 Feb. 07
6
Dimensions of Music Representations &
Encodings (2)
• Expressive completeness
– How much of all possible music can the representation
express?
– Includes synthesized as well as acoustic sounds!
– Waveform (=audio) is truly “complete”
– Exception, sort of: conceptual music
• E.g., Tom Johnson: Celestial Music for Imaginary Trumpets
(notes on 100 ledger lines), Cage: 4’ 33” (of silence), etc.
• Structural generality
– How much of structure in any piece of music can the
representation express?
– Music notation with repeat signs, etc. still expresses
nowhere near all possible structure
rev. 31 Jan. 07
7
Representation vs. Encoding
• Representation: what information is conveyed?
– More abstract (conceptual)
– Basic = general type of info; specific = exact type
• Encoding: how is the information conveyed?
– More concrete: in computer (“bits”)…or on paper
(“atoms”)!
• One representation can have many encodings
– “Atoms” example: music notation in printed or Braille
form
– “Bits” example: any kind of text in ASCII vs. Unicode
– “Bits” example: formatted text in HTML, RTF, .doc
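A minimal Python sketch of "one representation, many encodings": the same short text is stored under two character encodings and decodes to the same information either way. The sample string and the choice of UTF-16 as the Unicode encoding are assumptions made for illustration.

```python
# One representation (a short text), two encodings (byte layouts).
text = "Twinkle"  # illustrative sample string, not from the slides

ascii_bytes = text.encode("ascii")       # 1 byte per character
utf16_bytes = text.encode("utf-16-be")   # 2 bytes per character for this text

print(ascii_bytes.hex(" "))
print(utf16_bytes.hex(" "))

# Decoding either byte string recovers the same information.
same_information = ascii_bytes.decode("ascii") == utf16_bytes.decode("utf-16-be")
print(same_information)
```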
30 Jan. 06
8
Representation Example: a Bit of Mozart
The first few measures of Variation 8 of the “Twinkle” Variations
27 Jan.
9
In Notation Form: Nightingale Notelist
%%Notelist-V2 file='MozartRepresentationEx' partstaves=2 0 startmeas=193
C stf=1 type=3
C stf=2 type=10
K stf=1 KS=3 b
K stf=2 KS=3 b
T stf=1 num=2 denom=4
T stf=2 num=2 denom=4
A v=1 npt=1 stf=1 S1 'Variation 8'
D stf=1 dType=5
N t=0 v=1 npt=1 stf=1 dur=5 dots=0 nn=72 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
R t=0 v=2 npt=1 stf=2 dur=-1 dots=0 ...... appear=1
N t=240 v=1 npt=1 stf=1 dur=5 dots=0 nn=74 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
N t=480 v=1 npt=1 stf=1 dur=5 dots=0 nn=75 acc=0 eAcc=2 pDur=228 vel=55 ...... appear=1
N t=720 v=1 npt=1 stf=1 dur=5 dots=0 nn=77 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
/ t=960 type=1
N t=960 v=1 npt=1 stf=1 dur=4 dots=0 nn=79 acc=0 eAcc=3 pDur=456 vel=55 ...... appear=1
(etc. File size: 1862 bytes)
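A rough Python sketch of reading one Notelist record as a record type plus key=value fields. The function name is hypothetical, and reading t, nn and pDur as start time, note number and performed duration is an assumption based on how the values line up with the MIDI listing; the "......" elision in the slide is simply skipped, not filled in.

```python
def parse_notelist_record(line):
    """Split a Notelist line into its record type and numeric key=value fields."""
    tokens = line.split()
    record_type = tokens[0]
    fields = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        # Keep only numeric key=value tokens; anything else
        # (e.g. the "......" elision) is skipped.
        if sep and value.lstrip("-").isdigit():
            fields[key] = int(value)
    return record_type, fields

line = ("N t=240 v=1 npt=1 stf=1 dur=5 dots=0 nn=74 acc=0 eAcc=3 "
        "pDur=228 vel=55 ...... appear=1")
record_type, fields = parse_notelist_record(line)
print(record_type, fields["t"], fields["nn"], fields["pDur"])
```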
27 Jan.
10
An Event Form: Standard MIDI File (file dump)
  0: 4D54 6864 0000 0006 0001 0003 01E0 4D54  MThd.........‡MT
 16: 726B 0000 0014 00FF 5103 0B70 C000 FF58  rk......Q..p¿..X
 32: 0402 0218 0896 34FF 2F00 4D54 726B 0000  .....ñ4./.MTrk..
 48: 0055 00FF 0305 5069 616E 6F00 9048 3881  .U....Piano.êH8Å
 64: 6480 4840 0C90 4A38 8164 804A 400C 904B  dÄH@.êJ8ÅdÄJ@.êK
 80: 3881 6480 4B40 0C90 4D38 8164 804D 400C  8ÅdÄK@.êM8ÅdÄM@.
 96: 904F 3883 4880 4F40 1890 4F38 8360 9050  êO8ÉHÄO@.êO8É`êP
112: 3883 4880 4F40 1890 4D38 8330 8050 4018  8ÉHÄO@.êM8É0ÄP@.
128: 804D 400D FF2F 004D 5472 6B00 0000 3200  ÄM@../.MTrk...2.
144: FF03 0550 6961 6E6F 8F00 9041 2B81 6480  ...Pianoè.êA+ÅdÄ
160: 4140 0C90 4330 8164 8043 400C 9044 3181  A@.êC0ÅdÄC@.êD1Å
176: 6480 4440 0C90 4647 8164 8046 4001 FF2F  dÄD@.êFGÅdÄF@../
192: 00                                       .
5 Feb.
11
An Event Form: Standard MIDI File (interpreted)
Header format=1 ntrks=3 division=480

Track #1 start
t=0 Tempo microsec/MIDI-qtr=749760
t=0 Time sig=2/4 MIDI-clocks/click=24 32nd-notes/24-MIDI-clocks=8
t=2868 Meta event, end of track
Track end

Track #2 start
t=0 Meta Text, type=0x03 (Sequence/Track Name) leng=5
Text = <Piano>
t=0 NOn ch=1 num=72 vel=56
t=228 NOff ch=1 num=72 vel=64
t=240 NOn ch=1 num=74 vel=56
t=468 NOff ch=1 num=74 vel=64
(etc. File size: 193 bytes)
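A short Python sketch of how the interpreted listing follows from the raw bytes. The byte strings are copied from the file dump; the helper name read_vlq is hypothetical. It unpacks the 14-byte header chunk, then decodes two variable-length delta times and the 3-byte tempo value.

```python
import struct

# First 14 bytes of the dump: the "MThd" header chunk.
header = bytes.fromhex("4D546864 00000006 0001 0003 01E0")
chunk_id, length, fmt, ntrks, division = struct.unpack(">4sIHHH", header)
print(chunk_id, length, fmt, ntrks, division)

def read_vlq(data, pos=0):
    """Decode a MIDI variable-length quantity; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)   # 7 payload bits per byte
        if byte & 0x80 == 0:                   # top bit clear marks the last byte
            return value, pos

end_of_track_delta, _ = read_vlq(bytes.fromhex("9634"))   # before FF 2F 00 in track 1
note_off_delta, _ = read_vlq(bytes.fromhex("8164"))       # before the first Note Off
tempo = int.from_bytes(bytes.fromhex("0B70C0"), "big")    # data bytes of the FF 51 03 event
print(end_of_track_delta, note_off_delta, tempo)
```

The decoded values match the listing: division 480, end of track at t=2868, first Note Off at t=228, tempo 749760 microseconds per quarter note.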
27 Jan.
12
Basic and Specific Representations vs. Encodings
Basic and Specific Representations (above the line)
• Audio: Waveform
• Time-stamped Events: Time-stamped MIDI, Csound score, Time-stamped expMIDI
• Music Notation: Gamelan not., Tablature, CMN, Mensural not.
Encodings (below the line)
• .WAV, Red Book (CD), SMF, Csound score, expMIDI File, Notelist, MusicXML, Finale ETF
rev. 15 Feb.
13
Time Domain & Frequency Domain (1)
• Time domain involves waveforms
• Frequency domain involves spectra
• Fourier’s Theorem
– Any periodic signal can be described exactly as the sum
of sine waves at integral multiples of its fundamental
frequency (Fourier analysis)
– Fourier Transform takes time domain to frequency
– Inverse Fourier Transform takes freq. domain to time
– Fourier synthesis is usual kind of additive synthesis
• Definite-pitched sounds are (more-or-less)
periodic
8 Feb. 07
14
Time Domain & Frequency Domain (2)
• Sine waves in trigonometry
• Phase in degrees (0 to 360)
• “Simple” example of Fourier synthesis: perfect
square wave = an infinite number of odd
harmonics
• …but only if they’re all in phase
• Demo: Fourier applet
– http://www.falstad.com/fourier/
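A small Python sketch of the square-wave example, using only the standard library. It assumes the usual amplitudes of 1/k for the k-th odd harmonic (scaled by 4/π), with all harmonics in phase; the function name and the sample points are illustrative choices.

```python
import math

def square_wave_partial_sum(t, n_harmonics, frequency=1.0):
    """Sum the first n odd harmonics, each with amplitude 4/(pi*k), all in phase."""
    total = 0.0
    for i in range(n_harmonics):
        k = 2 * i + 1                      # odd harmonic numbers 1, 3, 5, ...
        total += math.sin(2 * math.pi * k * frequency * t) / k
    return 4 / math.pi * total

# At a quarter of the period the ideal square wave equals +1;
# the partial sums approach that value as harmonics are added.
for n in (1, 5, 50, 500):
    print(n, round(square_wave_partial_sum(0.25, n), 4))
```

With a finite number of harmonics the sum only approximates the square wave, which is why the slide specifies an infinite number.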
31 Jan. 07
15
Time Domain & Frequency Domain (3)
• But real-world sounds are almost never periodic!
• True, but definite-pitched notes are “close
enough” for Fourier analysis to be useful
– In reality, usually a series of short-term Fourier
Transforms (STFTs)
– Look at spectra of individual notes (from Iowa samples,
EBU arpeggios)
• This is mathematics & physics; perception
(psychoacoustics) is different, subtle
– We perceive musical sound in both domains—
sometimes more one, sometimes more the other
– Phase affects waveform, but maybe not perception
25 Feb. 07
16
Time Domain & Frequency Domain (4)
• Summary
Periodic waveforms: clearcut definite pitch
– With sharp corners: much high-freq. energy (square wave; loud trumpet)
– Without sharp corners: little high-freq. energy (sine wave; low, soft flute)
Almost-periodic waveforms: definite pitch (piano)
Aperiodic waveforms: noisy, no definite pitch (cymbal, bass drum)
25 Feb. 07
17
Real-World Musical Sounds
• The “Attack/Sustain/Release” model for notes
– Attack, Sustain, Release modified from recordings
• Used in the Kurzweil 250 (1984), etc.
– Original version had only 2 MB for all samples
– Piano had diff. samples for 2 loudness levels
– …and diff. sound for every 4-6 semitones
– 1-2 sec. per sample for A+S+R
• How good did the K250 really sound?
– COUNTDOWN, by Christopher Yavelow
– “An opera for the nuclear age”
• “the ‘orchestral accompaniment’ is in reality a Kurzweil-250
digital sampler, synchronized to the baton of the conductor…”
– http://www.yavelow.com/docs/countdown.html
12 Feb. 07
18
Real-World Musical Sounds
• Nowadays, can afford “unlimited” sustain
• …but also need diff. sounds for many (8?) diff.
loudness levels (multisampling)
– “All Together Now”, Electronic Musician, Jan. 2007
• …and diff. sound for every semitone or two
• W/ unlimited sustain, takes gigabytes just for
piano!
19 Feb. 07
19
Scholars (and others) Beware!
• Plausible (at the time) assumptions
– Stomach ulcers can’t be caused by organisms (20th
century)
– Men have more teeth than women (ancient)
• What you expect & what you see
– Sponges, dinosaurs, etc.: discuss later
• What you expect & what you hear
– Don & the Kurzweil 250 flute sound
– Don, a famous musician, & K250 handclaps
– Huron on what he “knew” & learned
• R. Moog at Kurzweil & piano touch
rev. 20 Sep. 2006
20
Uncompressed Audio Files are Big
• 1 byte = 8 bits (nearly always)
• How much data on a CD?
– CD audio is 44,100 samples/channel/sec. * 2
bytes/sample * 2 channels = 176,400 bytes/sec., or 10.5
MByte/min.
– CD can store up to 74 min. (or 80) of music
– 10.5 MByte/min. * 74 min. = 777 MBytes
– Actually more: also index, error correction data, etc.
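The arithmetic above, written out as a Python sketch. The constant names are illustrative. The exact product is about 10.58 million bytes per minute; the slide's 777 MBytes comes from rounding that to 10.5 before multiplying by 74.

```python
SAMPLE_RATE = 44_100      # samples per channel per second
BYTES_PER_SAMPLE = 2      # 16 bits
CHANNELS = 2              # stereo

bytes_per_second = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS
bytes_per_minute = bytes_per_second * 60
bytes_per_74_min = bytes_per_minute * 74

print(bytes_per_second)            # 176400
print(bytes_per_minute / 1e6)      # millions of bytes per minute
print(bytes_per_74_min / 1e6)      # millions of bytes on a 74-minute disc
```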
22 Sep. 2006
21
Compressed Audio: Lossless & Lossy
• Don’t confuse data compression and dynamic-range
compression (a.k.a. audio level compression, limiting)
• Codec = COmpressor/DECompressor
• Lossless compression
– Standard methods (Lempel-Ziv family: .zip, etc.) don’t do much for audio
– Audio specific methods
• MLP used for DVD-Audio
• Apple & Microsoft Lossless
• Lossy compression
– Depends on psychoacoustics (“perceptual coding”)
16 Feb. 06
22
Lossless Compression of Text
• Lossless compression of a
children’s nursery rhyme
Pease porridge hot,
Pease porridge cold,
Pease porridge in the pot,
Nine days old;
Some like it hot,
Some like it cold,
Some like it in the pot,
Nine days old.
• Diagram from Witten, Moffat, & Bell, Managing Gigabytes, 2nd ed.
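A quick Python sketch of lossless compression on the same rhyme, using the standard-library zlib module. zlib implements DEFLATE, a Lempel-Ziv-based method, which is not necessarily the scheme shown in the cited diagram; the point here is only that the repeated phrases shrink and the original comes back exactly.

```python
import zlib

rhyme = (
    "Pease porridge hot,\n"
    "Pease porridge cold,\n"
    "Pease porridge in the pot,\n"
    "Nine days old;\n"
    "Some like it hot,\n"
    "Some like it cold,\n"
    "Some like it in the pot,\n"
    "Nine days old.\n"
).encode("ascii")

compressed = zlib.compress(rhyme, 9)      # level 9 = strongest compression
restored = zlib.decompress(compressed)

# Sizes before and after, and whether the round trip is exact.
print(len(rhyme), len(compressed), restored == rhyme)
```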
12 Feb. 07
23
Specs for Some Common Audio Formats
| Format | Encoding Type | Details | Fidelity | Bandwidth (Kbps) |
| “Red Book” (CD) | Uncompressed, linear | 44.1 kHz, 16 bits/sample, stereo | Very high | ca. 1400 |
| Early game audio | Uncompressed, linear | 22.05 kHz, 8 bits/sample, mono | Low | 176 |
| MLP, Apple Lossless Compression, etc. | Lossless comp. | compression ca. 2:1 | Very high | ca. 700 |
| MP2 (Variations1) | Lossy comp. | compression ca. 3:1 | High | ca. 400 |
| MP3 (Variations2), AAC, WMA | Lossy comp. | compression ca. 7:1 to over 10:1 | High to very high | ca. 128-192 |
| AAC (Variations2), WMA | Lossy comp. | compression more than 20:1 | Medium | ca. 28-64 |
13 Feb. 06
24
Psychoacoustics & Perceptual Coding
• Pohlmann, Ken (2005). Principles of Digital
Audio, 5th ed., Chapter 10: Perceptual Coding
• Rationale: much better data compression
• Based on physiology of ear and critical bands
– Not fixed frequency: any sound creates one or more
critical bands
• Masking
– Depends on relative loudness & frequency
– Noise is a much better masker than pitched sounds
• Threshold of hearing
– Depends greatly on frequency
22 Sep. 2006
25
Compressed Audio: Lossy Compression (1)
• General method
1. Divide signal into sub-bands by frequency
2. Take advantage of (e.g.):
• Masking (“shadows”), via amplitude within critical bands
• Threshold of audibility (varies w/ frequency)
• Redundancy among channels
• MPEG-1 layers I thru III (MP1, 2, 3), AAC get better &
better compression via more & more complex techniques
– “There is probably no limit to the complexity of psychoacoustics.”
--Pohlmann, 5th ed.
– However, there probably is an “asymptotic” limit to compression!
• Implemented in hardware or software codecs
22 Feb. 06
26
Compressed Audio: Lossy Compression (2)
• Evaluation via critical listening is essential
– ITU 5-point scale
• 5 = imperceptible, 4 = perceptible but not annoying, 3 = slightly
annoying, 2 = annoying, 1 = very annoying
– Careful tests: often double-blind, triple-stimulus, hidden reference
• E.g., ISO qualifying AAC with 31 expert listeners (cf. Hall article)
– Test materials chosen to stress codecs
• Common useful tests: glockenspiel, castanets, triangle, harpsichord,
speech, trumpet
• Soulodre’s worst-case tracks: bass clarinet arpeggio, bowed double
bass, harpsichord arpeggio, pitch pipe, muted trumpet
• References: Pohlmann Principles of Digital Audio (on reserve)
17 Feb. 06
27
Hybrid Representation & Compression (1)
• Events (with “predefined” timbre) take very little
space
– Mozart fragment AIFF (CD-quality audio): 794,166
bytes
– Mozart fragment MIDI file: 193 bytes
– Timbre takes same amount of space, regardless of
music length!
– Problem: don’t have exact timbre for any performance
• CSound, CMusic, etc. have MIDI-like score and
software synthesis def. of orchestra
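A back-of-the-envelope Python sketch of the size comparison above. Both byte counts come from the slide; the duration estimate assumes CD-quality stereo and ignores the AIFF header, so it is approximate.

```python
AIFF_BYTES = 794_166                 # Mozart fragment as CD-quality audio
SMF_BYTES = 193                      # same fragment as a Standard MIDI File
CD_BYTES_PER_SECOND = 44_100 * 2 * 2

ratio = AIFF_BYTES / SMF_BYTES
approx_seconds = AIFF_BYTES / CD_BYTES_PER_SECOND   # header bytes ignored

print(round(ratio), round(approx_seconds, 2))
```

So the event form of this fragment is roughly four thousand times smaller than the audio, for about four and a half seconds of music.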
17 Feb. 07
28
Hybrid Representation & Compression (2)
• Mike Hawley’s approach: find structure in audio;
create events & timbre definition
– Hawley, Michael J. (1990). The Personal Orchestra, or,
Audio Data Compression by 10000:1. Usenix
Computing Systems Journal 3(2), pp. 289—329.
• Could hybrid event/audio representation lead to
his “audio data compression by a factor of
10,000”?
• Maybe, but no time soon!
17 Feb. 07
29
MIDI (Musical Instrument Digital Interface) (1)
• Invented in early 1980’s
– Dawn of personal computers
– Designed as simple (& cheap to implement) real-time
protocol for communication between synthesizers
– Low bandwidth: 31.25 Kbps
• Top bit of byte: 1 = status, 0 = data
– Numbers usually 7 bits (range 0-127); sometimes 14 or even 21
• Message types
– Channel: Channel Voice, Channel Mode
– System: System Common, System Real-Time, System Exclusive
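A minimal Python sketch of the status/data distinction and of joining two 7-bit data bytes into a 14-bit number. The function names are hypothetical, and the LSB-then-MSB argument order is a choice made for this sketch.

```python
def is_status_byte(byte):
    """Top bit set means a status byte; top bit clear means a data byte."""
    return byte & 0x80 != 0

def combine_14_bit(lsb, msb):
    """Join two 7-bit data bytes into one 14-bit value (0-16383)."""
    return ((msb & 0x7F) << 7) | (lsb & 0x7F)

stream = bytes.fromhex("90 48 38")   # three bytes taken from the file dump
print([("status" if is_status_byte(b) else "data", hex(b)) for b in stream])
print(combine_14_bit(0x00, 0x40))    # 8192, the midpoint of the 14-bit range
```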
5 Feb. 06
32
MIDI (2)
• Important standard Events are mostly Channel Voice msgs
– Note On: channel (1-16), note number (0-127), on velocity
– Note Off: channel, note number, off velocity
• Can change “voice” (really patch!) any time with Program
Change msg
• A way around the 16-channel limit: cables
– may or may not correspond to a physical cable
– each cable supports 16 channels independent of others
– Systems with 4 (=64 channels) or 8 cables (=128) are common
• MIDI Monitor allows watching MIDI in real time
– Freeware and open source!
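A small Python sketch of the two Channel Voice messages above, checked against bytes from the file dump (90 48 38 and 80 48 40). The function names are hypothetical; only Note On and Note Off are handled.

```python
def decode_channel_voice(message):
    """Decode a 3-byte Note On or Note Off message into readable fields."""
    status, note, velocity = message
    kind = {0x80: "NOff", 0x90: "NOn"}[status & 0xF0]   # high nibble = message type
    channel = (status & 0x0F) + 1                       # wire value 0-15 shown as 1-16
    return kind, channel, note, velocity

def encode_note_on(channel, note, velocity):
    """Build a Note On message for channel 1-16."""
    return bytes([0x90 | (channel - 1), note & 0x7F, velocity & 0x7F])

print(decode_channel_voice(bytes.fromhex("904838")))
print(decode_channel_voice(bytes.fromhex("804840")))
print(encode_note_on(1, 72, 56).hex())
```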
5 Feb. 06
33
MIDI Sequencers
• Record, edit, & play SMFs (Standard MIDI Files)
• Standard views
– Piano roll
• often with velocity, controllers, etc., in parallel
– Event list
– Other: Mixer, “Music notation”, etc.
– Standard editing
• Adding digital audio
– Personal computers & software-development tools have gotten
more & more powerful
– => "digital audio sequencers”: audio & MIDI (stored in hybrid
encodings)
• Making results more musical: “Humanize”
– Timing, etc. isn’t mechanical—but not really musical (yet)
8 Feb. 06
34
Is a MIDI File a “Score” or a Performance?
• MIDI files are often used to encode music from notation
• …but also often used to describe performances!
• What’s the difference?
– Timing
– Dynamics
– Realizing ornaments, etc.
• For scores, MIDI files are very limited
– Max. 16 explicit voices, no spelling info, no slurs, etc.
• …though not as badly as many assume
– Can include time sig., key sig., text/lyrics, etc.
• Cf. “Dimensions of Music Representations & Encodings”
graph
15 Feb. 07
35
Another Warning: Terminology (1)
• A perilous question: “How many voices does this
synthesizer have?”
• Syllogism
– Careless and incorrect use of technical terms is
dangerous to your learning much
– Experts very often use technical terms carelessly
– Beginners often use technical terms incorrectly
– Therefore, your learning much is in danger
• Somewhat exaggerated, but only somewhat
5 Feb. 06
36
Another Warning: Terminology (2)
• Not-too-serious case: “system”
– Confusion because both standard (common) computer
term & standard (rare but useful) music term
• Serious case: patch, program, timbre, or voice
– Vocabulary def.: Patch: referring to event-based systems such as MIDI
and most synthesizers (particularly hardware synthesizers), a setting that
produces a specific timbre, perhaps with additional features. The terms
"voice", "timbre", and "program" are all used for the identical concept;
all have the potential to cause substantial confusion and should be
avoided as much as possible
– “Patch” is the only unambiguous term of the four
– …but the official MIDI specification (& almost everything else)
talks about “voices” (as in “Channel Voice messages control the
instrument’s 16 voices”)
– …and to change the “voice”, you use a “program change”!
6 Feb. 06
37
Another Warning: Terminology (3)
• Some terminology is just plain difficult
• Example: “Representation” vs. “Encoding”
– Distinction: 1st is more abstract, 2nd more concrete
– …but what does that mean?
– Explaining milk to a blind person: “a white liquid...”
• Don’s precision involves being very careful with
terminology, difficult or not
– Vocabulary is important source
– Cf. other sources
– Contributions are welcome
6 Feb. 06
38
Selfridge-Field on Describing Musical Information
• Cf. Selfridge-Field, E. (1997). Describing Musical Information.
• What is Music Representation? (informal use of term!)
– Codes in Common Use: solfège (pitch only), CMN, etc.
– “Representations” for Computer Application: “total”, MIDI
• Parameters of Musical Information
– Contexts: sound, notation/graphical, analytic, semantic; gestural?
– Concentrates on 1st three
• Processing Order: horizontal or vertical priority
• Code Categories
– Sound Related Codes: MIDI and other
– Music Notation Codes: DARMS, SCORE, Notelist, Braille!?, etc.
– Musical Data for Analysis: Plaine and Easie, Kern, MuseData, etc.
– Representations of Musical Patterns and Process
– Interchange Codes: SMDL, NIFF, etc.; almost obsolete!
30 Jan. 06
39
Review: The Four Parameters of Notes
• Four basic parameters of a definite-pitched musical note
1. pitch: how high or low the sound is: perceptual analog of
frequency
2. duration: how long the note lasts
3. loudness: perceptual analog of amplitude
4. timbre or tone quality
• Above is decreasing order of importance for most Western
music
• …and decreasing order of explicitness in CMN!
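A short Python sketch relating pitch to frequency for MIDI note numbers. It assumes 12-tone equal temperament with note number 69 tuned to A = 440 Hz; both are common conventions rather than anything stated in these slides, and the function name is illustrative.

```python
def midi_note_to_hz(note_number, a4_hz=440.0):
    """Frequency of a MIDI note number in 12-tone equal temperament."""
    return a4_hz * 2 ** ((note_number - 69) / 12)

# Note 72 is the first note of the Mozart fragment in the MIDI listing.
for nn in (60, 69, 72):
    print(nn, round(midi_note_to_hz(nn), 2))
```

Each step of 12 in note number doubles the frequency, which is one reason pitch is described as a perceptual analog of frequency and not as frequency itself.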
40
Review: How to Read Music Without Really Trying
• CMN shows at least six aspects of music:
– NP1. Pitches (how high or low): on vertical axis
– NP2. Durations (how long): indicated by note/rest shapes
– NP3. Loudness: indicated by signs like p , mf , etc.
– NP4. Timbre (tone quality): indicated with words like “violin”, “pizzicato”, etc.
– Start times: on horizontal axis
– Voicing: mostly indicated by staff; in complex cases also shown by stem direction, beams, etc.
• See “Essentials of Music Reading” musical example.
41
Complex Notation (Selfridge-Field’s Fig. 1-4)
Complications on staff 2:
• Editorial additions (small notes)
• Instruments sharing notes only some of the time
• Mixed durations in double stops
• Multiple voices (divisi notation)
• Rapidly gets worse with more than 2!
10 Feb.
42
Complex Notation (Selfridge-Field’s Fig. 1-4)
Notating multiple voices rapidly gets worse with more than 2
• 2 voices in mm. 5-6: not bad: stem direction is enough
• 3 voices in m. 7: notes must move sideways
• 4 voices in m. 8: almost unreadable—without color!
• Acceptable because exact voice is rarely important
rev. 12 Feb.
43
Domains of Musical Information
• Independent graphic and performance info common
– Cadenzas (classical), swing (jazz), rubato passages (all music)
• CMN “counterexamples” show importance of independent graphic and logical info
– Debussy: bass clef below the staff
– Chopin: noteheads are normal 16ths in one voice, triplets in another
• Mockingbird (early 1980’s) pioneered three domains:
– Logical: “note is a qtr note” (= ESF (Selfridge-Field)’s “notation”)
– Performance: “note sounds for 456/480ths of a quarter” (= ESF’s “sound”; also called gestural)
– Graphic: “notehead is diamond shaped” (= ESF’s “notation”)
– Nightingale and other programs followed
• SMDL added fourth domain
– Analytic: for Roman numerals, Schenkerian level, etc. (= ESF’s “analytic”)
1 Feb. 06
44
Representing Voicing in MIDI Files vs. Notation
| MIDI File | Music Notation |
| Explicit via tracks (max. of 16) | Mostly explicit via staves, stem direction, voice-leading lines |
| Implicit via patch, etc. | Mostly implicit via stem direction, beaming, slurs, alignment, etc. |
20 Feb. 07
45
Representing Basic Parameters in MIDI Files Vs.
Notation
Timing (incl. duration)
– MIDI File: Metric or time-code-based time. Metric: ticks per quarter note (e.g., 480, 1024). Time-code-based: can be SMPTE or millisec. Encoding as delta time, to save space
– Music Notation: Metric. Relative duration via notehead shape, aug. dots, tuplets, fermatas, etc.; relative time from alignment. Tempo & metronome marks
Pitch
– MIDI File: Note number = piano key, plus (global) pitchbend. No distinction between spellings
– Music Notation: Spelling with accidentals, including double & (very rarely) triple
Dynamics
– MIDI File: Velocity: on (attack) & off (release)
– Music Notation: pppp… to ffff…, hairpins, text markings, accents, etc.
Timbre
– MIDI File: Patch no., aftertouch, off velocity
– Music Notation: Instrument name, text markings, symbols, etc.
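A brief Python sketch of metric time in a MIDI file, using the numbers from the Mozart listing (480 ticks per quarter note, 749760 microseconds per quarter note). The function names are hypothetical and a constant tempo is assumed.

```python
def ticks_to_seconds(ticks, division, tempo_us_per_quarter):
    """Convert metric time in ticks to seconds at a constant tempo."""
    return ticks * tempo_us_per_quarter / (division * 1_000_000)

def to_delta_times(absolute_ticks):
    """Replace absolute tick times with the gaps between successive events."""
    deltas, previous = [], 0
    for t in absolute_ticks:
        deltas.append(t - previous)
        previous = t
    return deltas

print(ticks_to_seconds(228, 480, 749_760))      # length of the first note, in seconds
print(to_delta_times([0, 228, 240, 468]))       # [0, 228, 12, 228]
```

Delta times are usually small numbers, so most of them fit in a single byte of the file; that is the space saving the table refers to.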
20 Feb. 07
46
Communicating about Music
• Basic principle of communicating with people: say just
what’s necessary
• Strunk & White: “Omit needless words”
• Applies to a lecture or a notation
22 Feb. 07
47
Representation, from Abstract to Concrete
• Cf. Basic Representations of Music & Audio
• Abstract: representation: semantics only
• Intermediate: syntax (mapping rules)?
• Concrete
– for use by computers: encoding
– for use by humans: if visual, notation (involves graphics and/or
typography)
• Analogous to knowledge representation vs. data structure
48
Semantics in Music
• Denotation (explicit, well-defined)...
• vs. Connotation (implicit, ill-defined)
• In text
– Two “definitions” of pig:
• 1. Ugh! Dirty, evil-smelling creatures, wallowing in filthy
sties! (Hayakawa)
• 2. Mammal with short legs, cloven hoofs, bristly hair, and a
cartilaginous snout used for digging (Amer. Heritage)
– Prose is “mostly” denotation
– Poetry is art => connotation much more important
• Music is always art, & only connotation!
– What is a musical idea?
• Major issue for content-based music IR
49
From Representation to Notation
• Choosing a representation inevitably introduces bias
• Given a representation, choosing notation inevitably
introduces more bias
• Important to consider the purpose (R. Davis et al; Wiggins
et al)
• For huge body of important music, we have no choice:
notation is CMN (Conventional Music Notation)!
– Really “CWMN” (W = Western)
– Alternative for some music: tablature (guitar, lute, etc.)
– CMN is among the most successful notations ever...
– but enormously complex and subtle
50
Review: How People Find Text Information
[Diagram: the Query and the Database each pass through “understanding” to yield Query concepts and Database concepts; matching the concepts produces Results.]
• What user wants is almost always concepts…
• But computer can only recognize words
51
Review: How Computers Find Text Information
[Diagram: the Query and the Database are each processed by stemming, stopping, query expansion, etc. (no understanding); matching produces Results.]
• “Stemming, stopping, query expansion” are all tricks to increase precision & recall (avoid false positives & false negatives) in the face of synonyms, variant forms of words, etc.
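A minimal Python sketch of the two measures named above. Precision falls with false positives and recall falls with false negatives. The function name and the document IDs are hypothetical, for illustration only.

```python
def precision_recall(retrieved, relevant):
    """Precision = share of retrieved items that are relevant;
    recall = share of relevant items that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Items 1 and 3 are false positives; item 5 is a false negative.
print(precision_recall(retrieved={1, 2, 3, 4}, relevant={2, 4, 5}))
```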
52
Notation Says Much about Representation
• CMN standard for Western music after ca.1650
• Evolved for “classical” music, but heavily used for
very wide range (pop, jazz, folk, etc.)
• Composers/arrangers/transcribers have pushed it
hard => reveals things about music representation
in general
• Will concentrate on notation (CMN)
53
Problems: Example 1 (superficial but
interesting)
• Ravel work has slur with 7 inflection points
• Impressive, but complexity is purely graphical
• No big deal in terms of representation
• …but influence of performance on notation is revealing
54
Duration and Higher-Level Concepts of Time
• Schubert Impromptu (& e 4)
• Measures: everything between barlines
• Time signature: 3/4 = 3 quarter notes per measure
• Triplets: 3 notes in the time normally used by 2
– General concept is tuplets
55
Problems: Example 2 (Deep)
• Chopin Nocturne has nasty situation (& e 5)
• One notehead is triplet in one voice, but normal
duration in another
• “Semantics” (execution) well-defined, obvious
– Note starts 1/16 before barline…
– But also (2/3)*(1/16) before barline! How to play?
• Reason: musical necessity
• Solution for performer: “rubato”
• Solution for music IR program: ?
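A small Python sketch of the conflict in exact arithmetic: the same notehead is asked to start both a normal 16th and a triplet 16th before the barline. The tick resolution of 480 per quarter note is an assumption borrowed from the earlier MIDI listing, not from the Chopin score.

```python
from fractions import Fraction

WHOLE_NOTE_TICKS = 4 * 480          # assumes 480 ticks per quarter note

sixteenth = Fraction(1, 16)                      # normal 16th, as a fraction of a whole note
triplet_sixteenth = Fraction(2, 3) * sixteenth   # 16th inside a triplet

for name, value in (("normal 16th", sixteenth), ("triplet 16th", triplet_sixteenth)):
    print(name, value, "of a whole note =", value * WHOLE_NOTE_TICKS, "ticks before the barline")

# The two voices disagree about the start time by this many ticks.
conflict_ticks = (sixteenth - triplet_sixteenth) * WHOLE_NOTE_TICKS
print("conflict:", conflict_ticks, "ticks")
```

A program that must assign the notehead one start time has to pick one of the two values or something in between, which is the question the slide leaves open.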
56
Problems: Example 3 (Medium)
• Bach: time signature change in middle of measure (& e 6)
• “Semantics” well-defined and obvious
– Measure has duration of 18 16ths…
– But not until the middle of the measure!
• How does this make sense?
• Triplets express same relationship as equivalent simple/compound meter
• Invisible (unmarked) triplets
• Cf. Bach Prelude: two time signatures at once (& e 7)
• Reason: avoid clutter
57
Problem 4 (Medium)
• Brahms Capriccio (& e 8)
• Time signature 6/8 => measure lasts 12 16ths
• A dotted half note always lasts 12 16ths…
• but here it clearly lasts only 11 16ths!
• Reason: avoid clutter
58
Two Ways to Have Two Clefs at Once
• Clef gives vertical offset to determine pitch
• Debussy (& e 9)
– Immediately obvious that something odd is going on with the clefs
• Ravel (& e 10)
– Only comparing time signature (3/8) and note durations
makes it clear both clefs affect whole measure
• Reason: save space (by avoiding a 3rd staff)
59
Surprise: Music Notation has Meta-Principles! (1)
1. Maximize readability (intelligibility)
– Avoid clutter = “Omit Needless Symbols”
– Try to assume just the right things for audience
– Audience for CMN is (primarily) performers
– General principle of any communication
• Applies to talks as well as music notation!
– Examples: Schubert, Bach, Brahms
60
Surprise: Music Notation has Meta-Principles! (2)
2. Minimize space used
– Save space => fewer page turns (helps
performer); also cheaper to print (helps publisher)
– Squeezing much music into little space is a major
factor in complexity of CMN
– Especially important for music: real-time, hands
full
– Examples: Telemann, Debussy, Ravel
61
The “Rules” of Music Notation
• Tempting to assume that rules of such an elaborate &
successful system as CMN work (self-consistent,
reasonably unambiguous, etc.) in every case
• But (a) “rules” evolved, with no established authority; (b)
many of the “rules” are very nebulous
• In common cases, there's no problem
• If you try to make every rule as precise as possible, result
is certainly not self-consistent
• Trying to save space makes rules interact; something has to
give!
62
Music Notation Software and Intelligence
• Despite odd notation, really nothing strange going on in
almost all of these examples
– Ravel slur, Debussy & Ravel 2 simultaneous clefs, Bach &
Schubert invisible triplets, Brahms “short” dotted-half note,
Telemann 4 voices/staff are all simple situations
– Chopin Nocturne is complex
• Programmers try to help users by having programs do
things “automatically”
• A good idea if software knows enough to do the right thing
“almost all” the time—but no program does!
• Notation programs convert CMN to performance (MIDI)
and vice-versa => requires shallow “semantics”; makes
things much harder
63
Conclusions: Review (1)
• Representations express Semantics
• Semantics of Music; Denotation & Connotation
• Principles of CMN
• Meta-Principles of CMN
1. Maximize readability; Omit Needless Symbols
• Try to assume just the right things for audience
• General principle of any communication
2. Minimize space used
• Save space => fewer page turns, less paper
Conclusions: Review (2)
• We need CMN or equivalent to solve spectrum of
music-IR (and other music-IT) problems
– But CMN can’t represent everything we want
– Even when it can, actually may not (esp.
explicitly)
– Need high-level intelligence to interpret
• Solution: unknown
– Likely to require major funding :-)
65
Why Music-IR Research is Important
(Outside of Music)
• Some problems directly related to other areas of
informatics
– Example: Approximate string matching in
bioinformatics
• Encourages progress on real semantics
– Connotation is an important part of meaning in
everything
– Can often ignore, but any semantics in arts forces you
to deal with connotation
– Music is at least as quantifiable as any art, so likely to
be more tractable than others!
66
Different Classifications of Music Encodings
| Selfridge-Field | Byrd |
| Sound-related codes (1): MIDI | Time-stamped MIDI |
| Sound-related codes (2): Other Codes for Representation and Control | Time-stamped Events + Audio |
| Musical Notation Codes (1): DARMS | CMN (domains L, G) |
| Musical Notation Codes (2): Other ASCII Representations | CMN (domains L, G) |
| Musical Notation Codes (3): Graphical-object Descriptions | CMN (domains L, P, G) |
| Musical Notation Codes (4): Braille | CMN: non-computer representation! |
| Codes for Data Management and Analysis (1): Monophonic Representations | CMN (emphasizes domain A) |
| Codes for Data Management and Analysis (2): Polyphonic Representations | CMN (emphasizes domain A) |
| Representations of Musical Patterns and Processes | “CMN” (abstracted; emphasizes A) |
| Interchange Codes | CMN (domains L, P, G, A) |
10 Feb.
67
Mozart: Variations for piano, K. 265, on
“Ah, vous dirais-je, Maman”, a.k.a. Twinkle
[Music notation example: Theme and Variation 2, on treble and bass staves in 2/4.]
68