Music Representation: Notation,
Conversion, & Acquisition
Donald Byrd
18 Oct. 2006
Copyright © 2006, Donald Byrd
Review: Representations of Music
• Three basic forms (representations) of music
– Audio: most important for most people (general public)
– MIDI files: often best/essential for some musicians,
especially for pop, rock, film/TV
– Notation: often best/essential for musicians (even
amateurs) & music scholars
– Essential difference: how much explicit structure
• Music holdings of Library of Congress: over 10M
items
– Includes over 6M pieces of sheet music and 100K’s of
scores of operas, symphonies, etc.: all notation!
• Differences are profound
rev. 8 Sep. 2006
2
Review: Basic Representations of Music &
Audio
[Figure: the three representations, each compared to a form of language]
– Digital Audio (e.g., CD, MP3): like speech
– Time-stamped Events (e.g., MIDI file): like unformatted text
– Music Notation: like text with complex formatting
1 Sep. 2006
3
Basic Representations of Music & Audio
(Columns, left to right: Audio | Time-stamped Events | Music Notation)
• Common examples: CD, MP3 file | Standard MIDI File | Sheet music
• Unit: Sample | Event | Note, clef, lyric, etc.
• Explicit structure: none | little (partial voicing information) | much (complete voicing information)
• Avg. rel. storage: 2000 | 1 | 10
• Convert to left: - | OK job: easy; Good job: hard | OK job: easy; Good job: hard
• Convert to right: 1 note: fairly hard; Other: very hard | OK job: hard; Good job: very hard | -
• Ideal for: music, bird/animal sounds, sound effects, speech | music | music
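The relative storage figures can be checked roughly against the Mozart excerpt used later in these slides (a 193-byte Standard MIDI File, a 1862-byte Notelist, end of track at tick 2868 with 480 ticks per quarter and 749760 microseconds per quarter). The sketch below assumes CD-quality audio (44.1 kHz, 16-bit, stereo); it covers a single short excerpt, so it only indicates the order of magnitude of the averages in the table.

```python
# Rough storage comparison for one short excerpt (a sketch, not a benchmark).
# Assumption: CD audio = 44,100 samples/s x 2 channels x 2 bytes per sample.
CD_BYTES_PER_SECOND = 44_100 * 2 * 2

# Figures quoted in these slides for the Mozart excerpt.
END_TICK = 2868            # end-of-track time in the MIDI file
TICKS_PER_QUARTER = 480    # "division" in the MIDI file header
USEC_PER_QUARTER = 749_760 # tempo in the MIDI file
SMF_BYTES = 193
NOTELIST_BYTES = 1862

duration_s = END_TICK / TICKS_PER_QUARTER * USEC_PER_QUARTER / 1_000_000
audio_bytes = duration_s * CD_BYTES_PER_SECOND

audio_ratio = audio_bytes / SMF_BYTES
notation_ratio = NOTELIST_BYTES / SMF_BYTES

print(f"duration: {duration_s:.2f} s")
print(f"audio : events   = {audio_ratio:.0f} : 1")
print(f"notation : events = {notation_ratio:.1f} : 1")
```

For this excerpt the notation-to-events ratio comes out close to 10, and the audio-to-events ratio is in the thousands.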
rev. 4 Oct. 2006
4
Review: The Four Parameters
of Notes
• Four basic parameters of a definite-pitched
musical note
1. pitch: how high or low the sound is: perceptual analog
of frequency
2. duration: how long the note lasts
3. loudness: perceptual analog of amplitude
4. timbre or tone quality
• Above is decreasing order of importance for
most Western music
• …and decreasing order of explicitness in CMN!
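Since pitch is the perceptual analog of frequency, a small sketch can make the link concrete. It assumes 12-tone equal temperament with A4 = 440 Hz and the MIDI convention that note number 69 is A4; the function name `midi_note_to_hz` is made up for this illustration.

```python
def midi_note_to_hz(note_number: int, a4_hz: float = 440.0) -> float:
    """Frequency of a MIDI note number, assuming 12-tone equal temperament."""
    # Each semitone multiplies the frequency by 2**(1/12); note 69 is A4.
    return a4_hz * 2.0 ** ((note_number - 69) / 12)


for nn in (60, 69, 72):
    print(nn, round(midi_note_to_hz(nn), 2))
```

Note that the mapping runs from pitch label to frequency only; perceived pitch also depends on context, which is one reason the slides call pitch a perceptual analog rather than a synonym.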
30 Aug. 2006
5
How to Read Music Without Really Trying
• CMN shows at least six aspects of music:
– NP1. Pitches (how high or low): on vertical axis
– NP2. Durations (how long): indicated by note/rest
shapes
– NP3. Loudness: indicated by signs like p , mf , etc.
– NP4. Timbre (tone quality): indicated with words like
“violin”, “pizzicato”, etc.
– Start times: on horizontal axis
– Voicing: mostly indicated by staff; in complex cases
also shown by stem direction, beams, etc.
• See “Essentials of Music Reading” musical example.
30 Aug. 2006
6
4. Music vs. Text and Other Media
(Columns, left to right: least explicit structure | medium | most | salience increasers)
• Music: audio | events | notation | loud; thin texture
• Text: audio (speech) | ordinary written text | text with markup | “headlining”: large, bold, etc.
• Images: photo, bitmap | PostScript | drawing-program file | bright color
• Video: videotape w/o sound | MPEG? | Premiere file | motion, etc.
• Biological data: DNA sequences, 3D protein structures (least); other columns: MEDLINE abstracts; ??
Classification: Surgeon General’s
Warning
• Classification (ordinary hierarchic) is dangerous
– Almost everything in the real world is messy
– Absolute correlations between characteristics are rare
– Example: some mammals lay eggs; some are “naked”
– Example: music genres (crossover, chorus + sax, etc.)
• People say “an X has features A, B, C, D…”
• Nearly always means “has feature A, and usually
also B, C, D…”
• Leads to:
– People who know better claiming absolute correlations
– Arguments among experts over which feature is most
fundamental
rev. 4 Oct. 06
8
Representation vs. Encoding
• Representation: what information is conveyed?
– More abstract (conceptual)
– Basic = general type of info; specific = exact type
• Encoding: how is the information conveyed?
– More concrete: in computer (“bits”)…or on paper
(“atoms”)!
• One representation can have many encodings
– “Atoms” example: music notation in printed or Braille
form
– “Bits” example: any kind of text in ASCII vs. Unicode
– “Bits” example: formatted text in HTML, RTF, .doc
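The "bits" example of text can be tried directly: one representation (a string of characters) has several encodings (byte sequences). This is a minimal sketch using Python's built-in codecs; the sample string is arbitrary.

```python
# One representation (the character string), several encodings (byte sequences).
text = "Mozart"
encodings = ["ascii", "utf-8", "utf-16-le", "utf-32-le"]

encoded = {name: text.encode(name) for name in encodings}
for name, data in encoded.items():
    print(f"{name:10s} {len(data):2d} bytes  {data.hex()}")

# Decoding each byte sequence with its own codec recovers the same text.
decoded = {data.decode(name) for name, data in encoded.items()}
print(decoded)
```

The byte sequences differ in length and content, yet all decode to the same string, which is the sense in which the representation is more abstract than any one encoding.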
30 Jan. 06
9
Basic and Specific Representations vs.
Encodings
[Figure: diagram in three levels, with basic and specific representations above a dotted line and encodings below it]
• Basic representations: Audio; Time-stamped Events; Music Notation
• Specific representations: Waveform; Time-stamped MIDI; Csound score; Time-stamped expMIDI; Gamelan notation; Tablature; CMN; Mensural notation
• Encodings: .WAV; Red Book (CD); SMF; Csound score; Notelist; expMIDI File; MusicXML; Finale ETF
rev. 15 Feb.
10
Selfridge-Field on Describing Musical
Information
• Cf. Selfridge-Field, E. (1997). Describing Musical Information.
• What is Music Representation? (informal use of term!)
– Codes in Common Use: solfegge (pitch only), CMN, etc.
– “Representations” for Computer Application: “total”, MIDI
• Parameters of Musical Information
– Contexts: sound, notation/graphical, analytic, semantic;
gestural?
– Concentrates on 1st three
• Processing Order: horizontal or vertical priority
• Code Categories
– Sound Related Codes: MIDI & other
– Music Notation Codes: DARMS, SCORE, Notelist, Braille!?, etc.
– Music Data for Analysis: Plaine & Easie, Kern, MuseData, etc.
– Representations of Musical Patterns & Process
– Interchange Codes: SMDL, NIFF, etc.; almost obsolete!
30 Jan. 06
11
Domains of Musical Information
• Independent graphic & performance info common
– Cadenzas (classical), swing (jazz), rubato passages (all music)
• CMN “counterexamples” show importance of independent
graphic and logical info
– Debussy: bass clef below the staff
– Chopin: noteheads are normal 16ths in one voice, triplets in
another
• Mockingbird (early 1980’s) pioneered three domains:
– Logical: “note is a qtr note” (= ESF’s “notation”; ESF = Selfridge-Field)
– Performance: “note sounds for 456/480ths of a quarter” (=
ESF’s “sound”; also called gestural)
– Graphic: “notehead is diamond shaped” (= ESF’s “notation”)
– Nightingale and other programs followed
• SMDL added 4th domain
– Analytic: for Roman numerals, Schenkerian level, etc. (= ESF’s
“analytic”)
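One way to picture the logical, performance, and graphic domains is as separate fields on the same note. The sketch below is purely illustrative: the class and field names are invented here and are not taken from Mockingbird, Nightingale, or SMDL, and the 480 ticks per quarter is assumed from the "456/480ths" example.

```python
from dataclasses import dataclass
from fractions import Fraction

TICKS_PER_QUARTER = 480  # assumed resolution, matching the "456/480ths" example


@dataclass
class Note:
    """Hypothetical note record with one field per domain."""
    logical_duration: Fraction  # logical: written value as a fraction of a whole note
    performed_ticks: int        # performance: how long the note actually sounds
    notehead: str               # graphic: how the notehead is drawn

    def performed_fraction_of_written(self) -> Fraction:
        """Sounding length relative to the written length."""
        written_ticks = self.logical_duration * 4 * TICKS_PER_QUARTER
        return Fraction(self.performed_ticks) / written_ticks


# A quarter note that sounds for 456 ticks and is drawn with a diamond notehead.
note = Note(Fraction(1, 4), 456, "diamond")
print(note.performed_fraction_of_written())
```

Keeping the fields independent is what allows cases such as swing or rubato, where the performed value departs from the written one without changing the notation.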
1 Feb. 06
12
Different Classifications of Music
Encodings
(Selfridge-Field category => Byrd classification)
• Sound-related codes (1): MIDI => Time-stamped MIDI
• Sound-related codes (2): Other Codes for Representation and Control => Time-stamped Events + Audio
• Musical Notation Codes (1): DARMS => CMN (domains L, G)
• Musical Notation Codes (2): Other ASCII Representations => CMN (domains L, G)
• Musical Notation Codes (3): Graphical-object Descriptions => CMN (domains L, P, G)
• Musical Notation Codes (4): Braille => CMN: non-computer representation!
• Codes for Data Management and Analysis (1): Monophonic Representations => CMN (emphasizes domain A)
• Codes for Data Management and Analysis (2): Polyphonic Representations => CMN (emphasizes domain A)
• Representations of Musical Patterns and Processes => “CMN” (abstracted; emphasizes A)
• Interchange Codes => CMN (domains L, P, G, A)
10 Feb.
13
Music Notation Software and
Intelligence (1)
• Cf. Byrd, D. (1994). Music Notation Software and Intelligence.
• Cases where famous composers flagrantly violate
important rules, yet results are easily readable
Fig. 1. Changing time signature in middle of the measure (J.S.
Bach)
Fig. 2. A measure with four horizontal positions for notes that
are all on the downbeat (Brahms)
Very different ways to have two clefs in effect at the same time:
Fig. 3. Bizarrely obvious (Debussy)
Fig. 4. So subtle, must think about the 3/8 meter to see bass
and treble clefs are both in effect throughout the measure
(Ravel)
• Really nothing very strange going on in any of these
rev. 15 Feb.
14
Music Notation Software and
Intelligence (2)
• Rules of CMN interact and aren’t always consistent
• Programmers try to help users by having programs do
things “automatically”
• A good idea if software knows enough to do the right
thing “almost all” the time
• Notation programs convert CMN to performance (MIDI)
and vice-versa => makes things worse
• Severo Ornstein’s complaint: programs that assume a
defined rhythmic structure
22 Feb. 06
15
Surprise: Music Notation has MetaPrinciples!
1. Maximize readability (intelligibility)
– Avoid clutter = “Omit Needless Symbols”
– Try to assume just the right things for audience
– Audience for CMN is (primarily) performers
– General principle of any communication
• Applies to talks as well as music notation!
– Examples: Schubert (avoid tuplet numerals), Bach (avoid tuplets)
2. Minimize space used
– Save space => fewer page turns (helps performer); also cheaper
to print (helps publisher)
– Squeezing much music into little space is a major factor in
complexity of CMN
– Especially important for music: real-time, performer’s hands
full
– Examples: Telemann, Debussy, Ravel (for all, reduce staves)
22 Feb. 06
16
Dimensions of Music Representations (1)
[Figure: representations plotted on two axes, Expressive Completeness and Structural Generality; points shown: Waveform, Csound, MusicXML, Notelist, MIDI (SMF)]
(After Wiggins et al. (1993). A Framework for the Evaluation of Music
Representation Systems.)
rev. 3 Feb.
17
Dimensions of Music Representations (2)
• Expressive completeness
– How much of all possible music can the representation
express?
– Includes synthesized as well as acoustic sounds!
– Waveform (=audio) is truly “complete”
– Exception, sort of: conceptual music
• E.g., Celestial Music for Imaginary Trumpets (notes
on 100 ledger lines), Cage: 4’ 33” (of silence), etc.
• Structural generality
– How much of structure in any piece of music can it
express?
– Music notation with repeat signs, etc. still expresses
nowhere near all possible structure
30 Jan. 06
18
Representation Example: a Bit of Mozart
The first few measures of Variation 8 of the “Twinkle” Variations
27 Jan.
19
In Notation Form: Nightingale Notelist
%%Notelist-V2 file='MozartRepresentationEx' partstaves=2 0 startmeas=193
C stf=1 type=3
C stf=2 type=10
K stf=1 KS=3 b
K stf=2 KS=3 b
T stf=1 num=2 denom=4
T stf=2 num=2 denom=4
A v=1 npt=1 stf=1 S1 'Variation 8'
D stf=1 dType=5
N t=0 v=1 npt=1 stf=1 dur=5 dots=0 nn=72 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
R t=0 v=2 npt=1 stf=2 dur=-1 dots=0 ...... appear=1
N t=240 v=1 npt=1 stf=1 dur=5 dots=0 nn=74 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
N t=480 v=1 npt=1 stf=1 dur=5 dots=0 nn=75 acc=0 eAcc=2 pDur=228 vel=55 ...... appear=1
N t=720 v=1 npt=1 stf=1 dur=5 dots=0 nn=77 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
/ t=960 type=1
N t=960 v=1 npt=1 stf=1 dur=4 dots=0 nn=79 acc=0 eAcc=3 pDur=456 vel=55 ...... appear=1
(etc. File size: 1862 bytes)
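Because each Notelist record is a record-type letter followed by key=value fields, a few lines are enough to read one. This is a sketch based only on the excerpt shown here, not on the Notelist specification: it keeps key=value tokens, skips everything else (including the "......" elisions), and the function name is invented.

```python
def parse_notelist_record(line: str) -> dict:
    """Parse one Notelist-style record into a dict (illustrative sketch only)."""
    tokens = line.split()
    record = {"kind": tokens[0]}  # e.g. "N" = note, "R" = rest, "C" = clef
    for token in tokens[1:]:
        if "=" not in token:
            continue  # skip elisions ("......") and free text
        key, value = token.split("=", 1)
        try:
            record[key] = int(value)
        except ValueError:
            record[key] = value  # keep non-numeric values as strings
    return record


line = ("N t=0 v=1 npt=1 stf=1 dur=5 dots=0 nn=72 acc=0 eAcc=3 "
        "pDur=228 vel=55 ...... appear=1")
note = parse_notelist_record(line)
print(note["kind"], note["t"], note["nn"], note["dur"], note["pDur"])
```

The parsed record shows the domains side by side: `dur` is the written (logical) duration code, while `pDur` and `vel` carry performance information.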
27 Jan.
20
Music Notation: Attempts at Standard
Encodings
• Markup-language based (SGML, then XML)
– SGML = Standard Generalized Markup Language
– “Application” of SGML for music
• SMDL = Standard Music Description Language: early & v. powerful, but
a flop
– XML = eXtensible Markup Language is hugely popular
– Applications of XML for music
• MusicXML is by far most popular; most verbose (5 notes of Mozart =
270 lines!)
• MEI also significant; others include MusiXML, MNML, NIFFML, etc. etc.
• Castan’s site www.music-notation.info lists programs importing &
exporting each encoding
– Gives an idea of which are most important/popular
– MusicXML is hands-down winner; next are GUIDO, NIFF, SCORE (early
2006)
27 Feb. 06
21
An Event Form: Standard MIDI File (file
dump)
  0: 4D54 6864 0000 0006 0001 0003 01E0 4D54  MThd.........‡MT
 16: 726B 0000 0014 00FF 5103 0B70 C000 FF58  rk......Q..p¿..X
 32: 0402 0218 0896 34FF 2F00 4D54 726B 0000  .....ñ4./.MTrk..
 48: 0055 00FF 0305 5069 616E 6F00 9048 3881  .U....Piano.êH8Å
 64: 6480 4840 0C90 4A38 8164 804A 400C 904B  dÄH@.êJ8ÅdÄJ@.êK
 80: 3881 6480 4B40 0C90 4D38 8164 804D 400C  8ÅdÄK@.êM8ÅdÄM@.
 96: 904F 3883 4880 4F40 1890 4F38 8360 9050  êO8ÉHÄO@.êO8É`êP
112: 3883 4880 4F40 1890 4D38 8330 8050 4018  8ÉHÄO@.êM8É0ÄP@.
128: 804D 400D FF2F 004D 5472 6B00 0000 3200  ÄM@../.MTrk...2.
144: FF03 0550 6961 6E6F 8F00 9041 2B81 6480  ...Pianoè.êA+ÅdÄ
160: 4140 0C90 4330 8164 8043 400C 9044 3181  A@.êC0ÅdÄC@.êD1Å
176: 6480 4440 0C90 4647 8164 8046 4001 FF2F  dÄD@.êFGÅdÄF@../
192: 00                                       .
5 Feb.
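The first 14 bytes of the dump are the header chunk, and they can be unpacked with Python's standard `struct` module. This is a minimal sketch that reads only the header; the variable names are chosen for illustration.

```python
import struct

# First 14 bytes of the dump: chunk type, chunk length, then three 16-bit fields.
header = bytes.fromhex("4D54 6864 0000 0006 0001 0003 01E0")

# ">" = big-endian; 4s = 4-byte type; I = 32-bit length; H = 16-bit unsigned.
chunk_type, length, file_format, ntrks, division = struct.unpack(">4sIHHH", header)

print(chunk_type, length, file_format, ntrks, division)
```

The unpacked values agree with the interpreted listing of the same file: format 1, 3 tracks, division 480.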
22
An Event Form: Standard MIDI File
(interpreted)
Header format=1 ntrks=3 division=480
Track #1 start
t=0 Tempo microsec/MIDI-qtr=749760
t=0 Time sig=2/4 MIDI-clocks/click=24 32nd-notes/24-MIDI-clocks=8
t=2868 Meta event, end of track
Track end
Track #2 start
t=0 Meta Text, type=0x03 (Sequence/Track Name) leng=5
Text = <Piano>
t=0 NOn ch=1 num=72 vel=56
t=228 NOff ch=1 num=72 vel=64
t=240 NOn ch=1 num=74 vel=56
t=468 NOff ch=1 num=74 vel=64
(etc. File size: 193 bytes)
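The times in the interpreted listing come from delta times stored as variable-length quantities (7 data bits per byte, with the top bit set on every byte except the last). The sketch below decodes two delta times taken from the dump and converts ticks to seconds using the tempo and division of this file; the function names are made up for the example.

```python
from typing import Tuple


def read_vlq(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one variable-length quantity; return (value, next position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)  # append the low 7 bits
        if byte & 0x80 == 0:                  # top bit clear: last byte
            return value, pos


def ticks_to_seconds(ticks: int, usec_per_quarter: int = 749_760,
                     division: int = 480) -> float:
    """Convert ticks to seconds at a fixed tempo (defaults are from this file)."""
    return ticks * usec_per_quarter / division / 1_000_000


first_note_length, _ = read_vlq(bytes.fromhex("8164"))  # delta before the first NOff
end_of_track, _ = read_vlq(bytes.fromhex("9634"))       # delta before end of track 1

print(first_note_length, end_of_track)
print(ticks_to_seconds(first_note_length), 60_000_000 / 749_760)
```

The two decoded values match the t=228 and t=2868 entries in the listing; at this tempo the first note sounds for about a third of a second, and the tempo works out to roughly 80 quarter notes per minute.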
27 Jan.
23
MIDI (Musical Instrument Digital
Interface) (1)
• Invented in early 80’s
– Dawn of personal computers
– Designed as simple (& cheap to implement) real-time
protocol for communication between synthesizers
– Low bandwidth: 31.25 Kbps
• Top bit of byte: 1 = status, 0 = data
– Numbers usually 7 bits (range 0-127); sometimes 14 or more
• Message types
– Channel Voice
– Channel Mode
– System Common
– System Real-Time
– System Exclusive
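The status/data split on the top bit can be sketched in a few lines. The table of Channel Voice message names below follows the MIDI 1.0 status byte layout (upper four bits give the message type, lower four the channel); the function name and the output strings are invented for this illustration.

```python
# Upper four bits of a channel status byte -> message type.
# (Control Change with controller numbers 120-127 carries Channel Mode messages.)
CHANNEL_VOICE = {
    0x8: "Note Off",
    0x9: "Note On",
    0xA: "Polyphonic Key Pressure",
    0xB: "Control Change",
    0xC: "Program Change",
    0xD: "Channel Pressure",
    0xE: "Pitch Bend",
}


def describe_byte(byte: int) -> str:
    """Classify one MIDI byte as data, a channel status, or a system status."""
    if byte & 0x80 == 0:
        return f"data {byte}"            # top bit 0: 7-bit data value
    if byte >= 0xF0:
        return "system message"          # System Common / Real-Time / Exclusive
    kind = CHANNEL_VOICE[byte >> 4]
    channel = (byte & 0x0F) + 1          # channels are numbered 1-16
    return f"{kind}, channel {channel}"


for b in (0x90, 0x48, 0x38):             # the first Note On in the file dump
    print(f"{b:#04x}: {describe_byte(b)}")
```

Applied to the bytes 90 48 38 from the file dump, this yields a Note On status on channel 1 followed by two data bytes, 72 and 56.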
5 Feb. 06
24
MIDI (2)
• Important standard Events are mostly Channel Voice
msgs
– Note On: channel (1-16), note number (0-127), on velocity
– Note Off: channel, note number, off velocity
• Can change “voice” any time with Program Change msg
• A way around the 16-channel limit: cables
– may or may not correspond to a physical cable
– each cable supports 16 channels independent of others
– Systems with 4 (=64 channels) or 8 cables (=128) are
common
• MIDI Monitor allows watching MIDI in real time
– Freeware and open source!
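Note On and Note Off messages are three bytes each: a status byte carrying the channel, then note number and velocity. A minimal sketch of building them follows; the helper names, including `total_channels` for the cable arithmetic, are hypothetical.

```python
def _channel_message(status_nibble: int, channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte channel message after range-checking the arguments."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1-16")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("note and velocity must be 0-127")
    return bytes([(status_nibble << 4) | (channel - 1), note, velocity])


def note_on(channel: int, note: int, velocity: int) -> bytes:
    return _channel_message(0x9, channel, note, velocity)


def note_off(channel: int, note: int, velocity: int) -> bytes:
    return _channel_message(0x8, channel, note, velocity)


def total_channels(cables: int) -> int:
    """Each cable carries 16 independent channels."""
    return cables * 16


print(note_on(1, 72, 56).hex(), note_off(1, 72, 64).hex())
print(total_channels(4), total_channels(8))
```

The two messages printed are the same byte sequences that open track 2 of the Mozart file dump (90 48 38 and 80 48 40).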
5 Feb. 06
25
MIDI Sequencers
• Record, edit, & play SMFs (Standard MIDI Files)
• Standard views
– Piano roll
• often with velocity, controllers, etc., in parallel
– Event list
– Other: Mixer, “Music notation”, etc.
– Standard editing
• Adding digital audio
– Personal computers & software-development tools have
gotten more & more powerful
– => “digital audio sequencers”: audio & MIDI (stored in hybrid
encodings)
• Making results more musical: “Humanize”
– Timing, etc. isn’t mechanical—but not really musical!
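A "Humanize" command can be imagined as small random perturbations of timing and velocity. The sketch below is an assumption about how such a feature might work, not a description of any particular sequencer; the jitter ranges and the function name are invented.

```python
import random


def humanize(events, max_tick_jitter=10, max_velocity_jitter=8, seed=None):
    """Randomly perturb (tick, note, velocity) events; pitches are left alone."""
    rng = random.Random(seed)  # a seed makes the result repeatable
    result = []
    for tick, note, velocity in events:
        new_tick = max(0, tick + rng.randint(-max_tick_jitter, max_tick_jitter))
        new_velocity = velocity + rng.randint(-max_velocity_jitter, max_velocity_jitter)
        new_velocity = min(127, max(1, new_velocity))  # stay in the MIDI range
        result.append((new_tick, note, new_velocity))
    return result


mechanical = [(0, 72, 56), (240, 74, 56), (480, 75, 56), (720, 77, 56)]
print(humanize(mechanical, seed=1))
```

Uniform random jitter of this kind removes the mechanical regularity, but it carries no musical intent, which is the point the slide makes.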
8 Feb. 06
26
Another Warning: Terminology (1)
• A perilous question: “How many voices does this
synthesizer have?”
• Syllogism
– Careless and incorrect use of technical terms is
dangerous to your learning very much
– Experts use technical terms carelessly most of the time
– Beginners often use technical terms incorrectly
– Therefore, your learning very much is in danger
• Somewhat exaggerated, but only somewhat
5 Feb. 06
27
Another Warning: Terminology (2)
• Not-too-serious case: “system”
– Confusion because both standard (common) computer
term & standard (rare but useful) music term
• Serious case: patch, program, timbre, or voice
– Vocabulary def.: Patch: referring to event-based systems such as MIDI
and most synthesizers (particularly hardware synthesizers), a setting
that produces a specific timbre, perhaps with additional features. The
terms "voice", "timbre", and "program" are all used for the identical
concept; all have the potential to cause substantial confusion and
should be avoided as much as possible
– “Patch” is the only unambiguous term of the four
– …but the official MIDI specification (& almost everything
else) talks about “voices” (as in “Channel Voice messages
control the instrument's 16 voices”)
– …and to change the “voice”, you use a “program change”!
6 Feb. 06
28
Another Warning: Terminology (3)
• Some terminology is just plain difficult
• Example: “Representation” vs. “Encoding”
– Distinction: 1st is more abstract, 2nd more concrete
– …but what does that mean?
– Explaining milk to a blind person: “a white liquid...”
• Don’s precision involves being very careful with
terminology, difficult or not
– Vocabulary is important source
– Cf. other sources
– Contributions are welcome
6 Feb. 06
29
Standard MIDI Files (1)
• File format = encoding
• Standard approved in 1988
• Very compact
• Files made up of chunks with 4-character type
• One Header chunk (“MThd”)
– Gives format, number of tracks, basis for timing
• Any number of Track chunks (“MTrk”)
– Stream of MIDI events and metaevents preceded by time
– 1st track is always timing track
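The chunk layout (a 4-character type, a 32-bit big-endian length, then that many bytes of data) makes a file easy to walk. The sketch below builds a tiny synthetic file rather than reading a real one; the function name `iter_chunks` and the sample bytes are made up for the example.

```python
import struct


def iter_chunks(data: bytes):
    """Yield (type, body) for each chunk in a Standard MIDI File image."""
    pos = 0
    while pos + 8 <= len(data):
        chunk_type, length = struct.unpack_from(">4sI", data, pos)
        body = data[pos + 8 : pos + 8 + length]
        yield chunk_type.decode("ascii"), body
        pos += 8 + length  # skip the 8-byte chunk header plus the body


# Synthetic example: a header chunk (format 1, 1 track, division 480)
# and one track holding only an end-of-track metaevent.
smf = (b"MThd" + struct.pack(">IHHH", 6, 1, 1, 480)
       + b"MTrk" + struct.pack(">I", 4) + b"\x00\xFF\x2F\x00")

chunks = list(iter_chunks(smf))
for chunk_type, body in chunks:
    print(chunk_type, len(body), body.hex())
```

Because every chunk states its own length, a reader can skip chunk types it does not recognize.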
5 Feb.
30
Standard MIDI Files (2)
• Metaevents
– Set Tempo (in timing track only)
– Text, Lyrics, Key/time signatures, instrument name, etc.
• What’s missing?
– Voice information limited to 16 channels
– Dynamics, beams, tuplets, articulation, expression marks,
note spelling, etc.: much less structure than CMN
• Attempts to overcome limitations
– Expressive MIDI, NotaMIDI, etc.
– ZIPI
– In a (more ambitious) way, Csound, etc.
– None of the limited attempts caught on
5 Feb. 06
31
Separating Representations Doesn’t
Work! (1)
• OK, I’m being overdramatic
– Really “doesn’t work well for many purposes”
• We shouldn’t be surprised
– Close relative of “Classification is Dangerous to Your
Health”
• Example: many popular notation encodings (e.g.,
MusicXML) add event info
• Example: multiple domains for notation add in
event info (performance domain)
• Example: Csound combines audio & events
• Hybrid systems
12 Feb. 06
32
Separating Representations Doesn’t
Work! (2)
• Extreme example of musical necessity: Jimi
Hendrix’s version of the Star-Spangled Banner at
Woodstock (1969)
– Goes from pure melody => noteless texture => back
repeatedly
– Cf. 2 kinds of notation: CMN & tablature
– What would a music-IR system do to recognize the Star-Spangled Banner?
– …or Taps? (a very different problem!)
• Attempts have been/are being made to combine
all three basic representations
3 Feb. 06
33
Even One Note can be Hairy
• Experience in the early days of Kurzweil (ca.
1983)
– Piano middle C(!) never sounded “good”
• ...except first, low-quality recording
• Couldn’t tell why from waveform, spectrogram, etc.
– Variable sampling rates were unusable
• An expensive mistake: cost ca. $1,000,000
– Scale on the flute didn’t sound realistic to a flutist—
but it was
– Lesson 1: expectations influence perception
– Lesson 2: nothing about music is clear-cut or simple
30 Aug. 2006
34