Organization of Musical Information

Music Representation,
Searching, and Retrieval
(a.k.a. Organization of and Searching in Musical
Information)
Donald Byrd
School of Informatics & School of Music
Indiana University
17 February 2008
1
Overview
1. Introduction and Motivation
2. Basic Representations
3. Why is Musical Information Hard to Handle?
4. Music vs. Text and Other Media
5. Programming in R
5. OMRAS and Other Projects
6. Summary
rev. Jan. 2006
2
Classification: Logician General’s Warning
• Classification is dangerous to your understanding
– Almost everything in the real world is messy, ill-defined
– Absolute correlations between characteristics are rare
• Example: some mammals lay eggs; some are “naked”
• Example: was the 1st piano Cristofori's, Broadwood's, another?
– People say “an X has characteristics A, B, C…”
– Usually mean “an X has A, & usually B, C…”
– Leads to:
• People who know better claiming absolute correlations
• “Is it this or that or that?” questions that don’t have an answer
• Don changing his mind
• But lack of classification is dangerous to understanding!
• So should we abandon (hierarchic) classifications?
– Of course not; they're much too useful
– Besides, impossible to avoid: they're built into our ways of thinking
– Best policy: be on guard for misleading things; be aware of alternatives
– Good alternative: "extended" hierarchies (e.g., DAGs instead of trees)
23 Jan. 08
3
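The "extended hierarchies" suggestion on the classification slide can be made concrete with a small sketch. This is only an illustration, not something from the course materials: the category names and the `ancestors` helper are hypothetical, chosen to echo the egg-laying mammal example. In a tree each category has one parent; in a DAG it may have several.

```python
# Hypothetical toy classification: each category maps to its parent categories.
# In a tree every node has at most one parent; a DAG allows several.
PARENTS = {
    "animal": [],
    "mammal": ["animal"],
    "egg-layer": ["animal"],
    "dog": ["mammal"],
    "platypus": ["mammal", "egg-layer"],  # two parents: impossible in a tree
}


def ancestors(category, parents=PARENTS):
    """Return the set of all categories reachable by following parent links."""
    found = set()
    stack = list(parents[category])
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents[node])
    return found


print(sorted(ancestors("platypus")))
print(sorted(ancestors("dog")))
```

With a DAG, "platypus" can be found under both "mammal" and "egg-layer" without forcing an either/or answer.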
1. Introduction and Motivation (1)
• The Fundamental Theorem of Music Informatics
(maybe)
– Music is created by humans for other humans
• Humans can bring a tremendous amount of contextual knowledge to bear
• In fact, they can't avoid it, and they're rarely conscious of it
– But (as of early 2008) computers can never bring much contextual
knowledge to bear, often none, & never without being specifically
programmed to
– => doing almost anything with music by computers is very
difficult; many problems essentially intractable
– For the foreseeable future, the only way to make significant progress is
by doing as well as possible with little context, sidestepping
intractable problems
• Not a theorem (I recently made it up), but important
11 Jan. 08
4
1. Introduction and Motivation (2)
• How can we “do as well as possible with little context” in order to
“sidestep intractable problems”?
– Round off corners; make everything fuzzy
– …but has to be the right kind of fuzziness!
– A popular technique: approximate matching
– w/ music in symbolic form, approximate string matching vs. approximate geometric matching
– Another popular technique: statistical methods
• Ex: best-match searching (w/ ranked results) vs. exact-match searching
• Blair & Maron’s early prediction of disaster in text IR based partly on
assumption of exact matching
• Ex: identifying the actual notes in polyphonic audio is extremely difficult
• …but identifying chroma is much easier, & often good enough to get useful
results, esp. w/ proper statistical methods
4 Feb. 08
5
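To make "approximate string matching" and "best-match searching (w/ ranked results)" from the slide above concrete, here is a minimal sketch under stated assumptions. Melodies are assumed to be lists of MIDI note numbers, the string being matched is the sequence of pitch intervals, and the similarity measure is plain Levenshtein edit distance with unit costs. The database, query, and function names are made up for illustration; real systems use more refined measures.

```python
def intervals(pitches):
    """Convert MIDI note numbers to successive pitch intervals in semitones."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rank(query, database):
    """Best-match search: return (distance, title) pairs, closest first."""
    q = intervals(query)
    return sorted((edit_distance(q, intervals(p)), title)
                  for title, p in database.items())


# Made-up miniature "database" of melodies as MIDI note numbers.
DATABASE = {
    "tune A": [60, 62, 64, 65, 67],
    "tune B": [60, 60, 67, 67, 69, 69, 67],
    "tune C": [72, 71, 69, 67, 65],
}
# Query: tune A transposed up a whole tone, with one wrong note.
QUERY = [62, 64, 66, 68, 69]
print(rank(QUERY, DATABASE))
```

An exact-match search would return nothing for this query; the ranked search still puts "tune A" first, because only two of its intervals differ from the query's.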
1. Introduction and Motivation (3)
• Three basic forms (representations) of music are important
– Audio: most important for most people (general public)
• All Music Guide (www.allmusicguide.com) has info on >>230,000 CD’s
– MIDI files: often best or essential for some musicians, especially
for pop, rock, film/TV
• Hundreds of thousands of MIDI files on the Web
– CWMN (Conventional Western Music Notation): often best or
essential for musicians (even amateurs) & music researchers
• Music holdings of Library of Congress: over 10M items
– Most is notation, especially CWMN, not audio
– Includes over 6M pieces of sheet music and tens/hundreds of
thousands of scores of operas, symphonies, etc.
• Differences among the forms are profound
– NB: all statistics above are several years old
rev. 11 Jan. 08
6
2. Basic Representations of Music & Audio (1)
• Audio (e.g., CD, MP3): like speech
• Time-stamped Events (e.g., MIDI file): like unformatted text
• Music Notation: like text with complex formatting
27 Jan.
7
Basic Representations of Music & Audio (2)
                    Audio                     Time-stamped Events      Music Notation
Common examples     CD, MP3 file              Standard MIDI File       Sheet music
Unit                Sample                    Event                    Note, clef, lyric, etc.
Explicit structure  none                      little (partial voicing  much (complete voicing
                                              information)             information)
Avg. rel. storage   2000                      1                        10
Convert to left     -                         easy                     OK job: easy
Convert to right    1 note: pretty easy;      OK job: fairly hard      -
                    other: hard or very hard
Ideal for           music, bird/animal        music                    music
                    sounds, sound effects,
                    speech
27 Jan.
8
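The "Avg. rel. storage" row above can be sanity-checked with rough arithmetic. The sketch below is a back-of-the-envelope estimate only. It assumes CD-quality audio (44,100 samples per second, 16 bits, stereo) and an event list costing about 4 bytes per event with two events per note; the note density is also an assumption. None of these figures come from the slides, and the result only shows that the ratio is in the thousands.

```python
# Assumption: CD-quality audio (44,100 samples/s, 16-bit, stereo).
SAMPLE_RATE = 44_100
BYTES_PER_SAMPLE = 2
CHANNELS = 2


def audio_bytes(seconds):
    """Uncompressed audio size: one sample per channel per sampling tick."""
    return int(seconds * SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)


def event_bytes(n_notes, bytes_per_event=4, events_per_note=2):
    """Rough event-list size: a note-on and a note-off per note.

    bytes_per_event is an assumed average (delta time plus message),
    not a figure from the slides.
    """
    return n_notes * events_per_note * bytes_per_event


seconds = 60
notes = 60 * 8  # assumed density: 8 notes per second
a, e = audio_bytes(seconds), event_bytes(notes)
print(a, e, round(a / e))
```

Changing the assumed note density or event size moves the ratio up or down, but it stays around three orders of magnitude.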
Basic Representations of Music & Audio (3)
Audio: no (explicit) structure
Events/MIDI: simple structure
Notation: very complex structure
2 Oct. 07
9
The Four Parameters of Notes
• Four basic parameters of a definite-pitched
musical note
1. pitch: how high or low the sound is: perceptual analog
of frequency
2. duration: how long the note lasts
3. loudness: perceptual analog of amplitude
4. timbre or tone quality
• Above is decreasing order of importance for most
Western music
• …and decreasing order of explicitness in CWMN!
10
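The four parameters above map naturally onto a small record type. The following sketch is an illustration under assumptions, not a representation used in the course: pitch is encoded as a MIDI note number, loudness as a MIDI-style velocity, and the pitch/frequency relationship assumes twelve-tone equal temperament with A = 440 Hz. It also ignores the fact that pitch is a perceptual quantity and only approximately tied to frequency.

```python
import math
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int        # MIDI note number (69 = A above middle C)
    duration: float   # in quarter notes
    loudness: int     # MIDI-style velocity, 0-127
    timbre: str       # instrument name; timbre has no simple numeric scale


def frequency(pitch, a4=440.0):
    """Equal-tempered frequency in Hz for a MIDI note number."""
    return a4 * 2 ** ((pitch - 69) / 12)


def nearest_pitch(freq, a4=440.0):
    """Inverse mapping: the MIDI note number closest to a frequency."""
    return round(69 + 12 * math.log2(freq / a4))


note = Note(pitch=60, duration=1.0, loudness=80, timbre="violin")
print(round(frequency(note.pitch), 2))
```

Note how the fields get progressively harder to pin down: pitch and duration are plain numbers, loudness is a coarse scale, and timbre is reduced to a label, which parallels the decreasing explicitness in CWMN.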
How to Read Music Without Really Trying
• CWMN shows at least six aspects of music:
– NP1. Pitches (how high or low): on vertical axis
– NP2. Durations (how long): indicated by note/rest shapes
– NP3. Loudness: indicated by signs like p , mf , etc.
– NP4. Timbre (tone quality): indicated with words like “violin”, “pizzicato”, etc.
– Start times: on horizontal axis
– Voicing: mostly indicated by staff; in complex cases also
shown by stem direction, beams, etc.
• See “Essentials of Music Reading” musical example
– At the other extreme, see the “Gallery of Interesting Music
Notation”!
• http://www.informatics.indiana.edu/donbyrd/InterestingMusicNotation.html
rev. 11 Jan. 08
11
How People Find Text Information
[Diagram: Query → understanding → Query concepts; Database → understanding → Database concepts; Query concepts and Database concepts → matching → Results]
• What user wants is almost always concepts…
• But computer can only recognize words
12
How Computers Find Text Information
[Diagram: Query and Database → stemming, stopping, query expansion, etc. (no understanding on either side) → matching → Results]
• “Stemming, stopping, query expansion” are all tricks to increase
precision & recall (avoid false positives & false negatives) due to
synonyms, variant forms of words, etc.
13
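The three tricks named on the slide above can be sketched in a few lines. This is a toy under loud assumptions: the stop-word list, the synonym table, and the suffix-stripping "stemmer" are all made up for illustration and are far cruder than what real text-IR systems use. The point is only that the matching is purely on word forms, with no understanding involved.

```python
STOP_WORDS = {"a", "an", "the", "of", "in", "for", "and", "to"}
# Hypothetical synonym table (keyed by stem) used for query expansion.
SYNONYMS = {"tune": {"melody"}, "melody": {"tune"}}


def stem(word):
    """Very crude suffix stripping; real systems use proper stemmers."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def terms(text):
    """Lower-case, drop stop words (stopping), and stem what is left."""
    words = (w.strip(".,;:!?\"'") for w in text.lower().split())
    return {stem(w) for w in words if w and w not in STOP_WORDS}


def expand(query_terms):
    """Query expansion: add the synonyms of each query term."""
    expanded = set(query_terms)
    for t in query_terms:
        expanded |= {stem(s) for s in SYNONYMS.get(t, ())}
    return expanded


def search(query, documents):
    """Return titles of documents sharing at least one term with the query."""
    q = expand(terms(query))
    return [title for title, text in documents.items() if q & terms(text)]


DOCS = {
    "doc1": "Searching for melodies in the database",
    "doc2": "The history of the piano",
}
print(search("tunes", DOCS))
```

Without stemming and expansion, the query "tunes" shares no word with either document; with them, it reaches "doc1" through "melodies", which avoids a false negative.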
3. Why is Musical Information Hard
to Handle?
1. Units of meaning: not clear there are any—assuming music
even has meaning! (all representations)
2. Polyphony: “parallel” independent voices, something like
characters in a play (all representations)
3. Recognizing notes (audio only)
4. Other reasons
– Musician-friendly I/O is difficult
– Diversity: of styles of music, of people interested in music
• Music is an art!
• Cf. Byrd & Crawford (2002)
• But what is the information, in the first place?
rev. 12 Jan. 08
14
Units of Meaning (Problem 1)
• Handling text information nearly always via words
– “What we want is concepts; what we have is words”
• Not clear anything in music is analogous to words
– No explicit delimiters (like Chinese)
– Experts don’t agree on “word” boundaries (unlike Chinese)
– Music is always art => “meaning” much more subtle!
• Are notes like words?
• No. Relative, not absolute, pitch is important
• Are pitch intervals like words?
• No. They’re too low level: more like characters
rev. Jan. 2007
15
Units of Meaning (Problem 1)
• Are pitch intervals like words?
• No. They’re too low level: more like characters
• Are pitch-interval sequences like words?
• In some ways, but
– Ignores rhythm
– Ignores relationships between voices (harmony)
– Probably little correlation with semantics
• Are chords like words? (Christy Keele)
– If so, chord progressions may be like sentences
– In some ways, but ignores melody & rhythm, most relevant for
tonal music, etc.
• Anyway, in much music, pitch isn’t important, and/or notes
aren’t important!
rev. Jan. 07
16
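The two "Units of Meaning" slides can be illustrated with a short sketch. It assumes melodies are lists of MIDI note numbers and uses an arbitrary example melody. It shows why relative pitch (intervals) is a better basis than absolute notes, and what treating short pitch-interval sequences (n-grams) as candidate "words" looks like. Using n-grams this way is one common option, not necessarily what the author has in mind.

```python
def intervals(pitches):
    """Successive differences in semitones between MIDI note numbers."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def ngrams(seq, n):
    """All length-n windows of a sequence, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


melody = [67, 67, 69, 67, 72, 71]       # arbitrary example melody
transposed = [p + 5 for p in melody]    # same tune, a fourth higher

# Every note number changes under transposition, but the intervals do not.
print(intervals(melody))
print(intervals(melody) == intervals(transposed))

# Candidate "words": overlapping 3-grams of intervals.
print(ngrams(intervals(melody), 3))
```

The sketch also exhibits the weaknesses listed on the slide: nothing in these interval "words" records rhythm, and a single list of pitches cannot express relationships between voices.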
Independent Voices in Music
(Problem 2)
J.S. Bach: “St. Anne” Fugue, beginning
17
Independent Voices in Text
MARLENE. What I fancy is a rare steak. Gret?
ISABELLA. I am of course a member of the / Church of England.*
GRET. Potatoes.
MARLENE. *I haven’t been to church for years. / I like Christmas carols.
ISABELLA. Good works matter more than church attendance.
--Caryl Churchill: “Top Girls” (1982), Act 1, Scene 1
Performance (time goes from left to right):
M: What I fancy is a rare steak. Gret?                                                   I haven’t been...
I:                                     I am of course a member of the Church of England.
G:                                                                    Potatoes.
18
Music Notation vs. Audio
• Relationship between notation and its sound is very subtle
• Not at all one symbol <=> one symbol
– Notes w/ornaments (trills, etc.) are one => many
– All symbols but notes are one => zero!
– Bach F-major Toccata example
• Style-dependent
– Swing (jazz), dotting (baroque art music)
– Improvisation (baroque art music, jazz)
– “Events” (20th-century art music)
– How well-defined is style-dependent
• Interpretation is difficult even for musicians
– Can take 50-90% of lesson time for performance students
19
Music Perception and Music IR
• Salience is affected by texture, loudness, etc.
– Inner voices in orchestral music rarely salient
• Streaming effects and cross-voice matching
– produced by timbre: Wessel’s illusion (Ex. 1, 2)
– produced by register: Telemann example (Ex. 3)
• Octave identities, timbre and texture
– Beethoven “Hammerklavier” Sonata example (Ex.4, 5)
– Affects pitch-interval matching
20
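One way the octave-identity point can affect pitch-interval matching is sketched below. This is an assumption-laden illustration, not a description of the slide's musical examples: pitches are MIDI note numbers, and "octave identity" is modeled by reducing intervals modulo 12, so that a note and its octave transpositions are treated as equivalent.

```python
def pitch_class(pitch):
    """Fold a MIDI note number into one of 12 pitch classes (C = 0)."""
    return pitch % 12


def exact_intervals(pitches):
    """Signed intervals in semitones between successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def interval_classes(pitches):
    """Intervals reduced modulo 12, ignoring octave displacement."""
    return [(b - a) % 12 for a, b in zip(pitches, pitches[1:])]


original = [60, 62, 64, 65]
displaced = [60, 74, 64, 77]  # 2nd and 4th notes moved up an octave

print(exact_intervals(original), exact_intervals(displaced))
print(interval_classes(original), interval_classes(displaced))
```

Matching on exact intervals treats the two versions as unrelated, while matching modulo 12 treats them as identical. Which behavior is right depends on whether listeners hear the displaced version as the same melody, which is a perceptual question the code cannot answer.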
4. Music vs. Text and Other Media
            ———— Explicit Structure ————
            least                  medium                 most                    Salience increasers
Music       audio                  events                 notation                loud; thin texture
Text        audio (speech)         ordinary written text  text with markup        “headlining”: large, bold, etc.
Images      photo, bitmap          PostScript             drawing-program file    bright color
Video w/o   videotape, DVD         MPEG?                  Premiere or Flash file  motion, etc.
sound       of movie
Biological  DNA sequences,         MEDLINE abstracts      ??
data        3D protein structures
21
Features of Music: Text Analogies
• Simultaneous independent voices and texture
– Analogy in text: characters in a play
• Chords within a voice
– Analogy in text: character in a play writing something visible to the audience while saying something different out loud
• Rhythm
– Analogy in text: rhythm in poetry
• Notes and intervals
– Note pitches rarely important
– Intervals more significant, but still very low-level
– Analogy in text: interval = (very roughly!) letter, not word
22
Features of Text: Music Analogies
• Words
– Analogy in music: for practical purposes, none
• Sentences
– Analogy in music: phrases (but much less explicit)
• Paragraphs
– Analogy in music: sections of a movement (but less explicit)
• Chapters
– Analogy in music: movements
23
Course Overview
• II. Organization of Musical Information (music
representation)
– “What we want is concepts; what we have is words”
– Audio, MIDI, notation
• III. Finding Musical Information
– A Similarity Scale for Content-Based Music IR
• IV. Musical Similarity and Finding Music by Content
• V. Finding music via Metadata
– Digital music libraries (Variations2), iTunes, etc.
– Music recommender systems
Jan. 07
24
Programming in R: No Problem!
• R is very interactive: can use as powerful graphing calculator
• => easier to experiment with & learn, & useful that way
• Assignments will be fairly simple (though not for MusInfo &
CompSci grad students :-) )
• Much help available: from Don & other students
• Why R?
– designed for statistics, but that’s NOT why!
– easy to do simple things with it
– easy to do many fairly complex things, incl. graphs & handling audio files
– probably not good for really complex programs
– free, & available for all popular operating systems
– very interactive => easy to experiment
– has good documentation
– In use in other Music Informatics classes, & standardizing is good
11 Jan. 08
25
Rudiments of R
• Originally for statistics, but useful for far more
• To get R
– Web site: http://cran.us.r-project.org/
– Has lots of documentation (tutorials, manuals, etc.), too
– Versions for Linux, Mac OS X, Windows
– On all(?) STC computers
• Tutorial:
• http://www.informatics.indiana.edu/donbyrd/Teach/RTools+Docs//R_tutorial_DAB.txt
• Can use R interactively as a powerful graphing, musicing, etc.
calculator
• …but it’s not perfect: sometimes very cryptic
rev. 20 Jan. 08
26
Music Recommender Systems
• Work by genre classification and/or collaborative filtering
• Major interest recently
• Best known include:
– Pandora (cf. “Music Genome Project”)
– Last.fm
– MusicStrands
• Other interesting sites
– Hype Machine (“for savants”?)
11 Jan. 08
27
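Collaborative filtering, mentioned on the slide above, can be sketched very simply. The listening data, user names, and scoring scheme below are made up for illustration. This is a generic user-based approach with cosine similarity, and it makes no claim about how Pandora, Last.fm, or any other named service actually works.

```python
from math import sqrt

# Made-up listening data: user -> {track: play count}.
PLAYS = {
    "ann": {"track1": 5, "track2": 3},
    "bob": {"track1": 4, "track2": 2, "track3": 5},
    "cat": {"track4": 5, "track5": 1},
}


def cosine(u, v):
    """Cosine similarity between two sparse play-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = (sqrt(sum(x * x for x in u.values()))
            * sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def recommend(user, plays=PLAYS):
    """Rank tracks the user has not played by similarity-weighted play counts."""
    scores = {}
    for other, tracks in plays.items():
        if other == user:
            continue
        sim = cosine(plays[user], tracks)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for track, count in tracks.items():
            if track not in plays[user]:
                scores[track] = scores.get(track, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("ann"))
```

Nothing here looks at the music itself: the recommendation comes entirely from other listeners' behavior, which is what separates collaborative filtering from content-based approaches such as genre classification.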
Typke’s MIR System Survey
• Rainer Typke’s “MIR Systems: A Survey of Music Information
Retrieval Systems” lists many systems
– http://mirsystems.info/
• Commercial system: Shazam
• Some research systems can be used over the Web, incl.:
– C-Brahms
– Meldex/Greenstone
– Mu-seek
– MusicSurfer
– Musipedia/Tuneserver/Melodyhound
– QBH at NYU
– Themefinder
28
Machinery to Evaluate Music-IR Research
• Problem: how do we know if one system is really better
than another, or an earlier version?
• Solution: standardized tasks, databases, evaluation
– In use for speech recognition, text IR, question answering, etc.
• Important example: TREC (Text Retrieval Conference)
• For music IR, we now have...
• IMIRSEL (International Music Information Retrieval
Systems Evaluation Laboratory) project
– http://www.music-ir.org/evaluation/
• MIREX (Music IR Evaluation eXchange) modeled on
TREC
– 2005: audio only
– 2006, 2007: audio and symbolic
29
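Standardized evaluation needs agreed-upon measures. As a minimal sketch, here are precision and recall, the two standard IR measures mentioned earlier in connection with text searching. The item names are hypothetical, and this makes no claim about which measures any particular MIREX task uses.

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved items that are relevant.
    Recall: fraction of relevant items that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result lists from two systems for the same query,
# judged against the same set of relevant items.
RELEVANT = {"a", "c", "e"}
system_1 = ["a", "b", "c", "d"]
system_2 = ["a", "c"]

print(precision_recall(system_1, RELEVANT))
print(precision_recall(system_2, RELEVANT))
```

The two systems find the same relevant items, so their recall is equal, but the second returns no false positives and so has higher precision. Comparisons like this are only meaningful when both systems are run on the same task and collection, which is the point of TREC-style evaluation.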
Collections (a.k.a. Databases) (1 of 2)
• Collections are improving, but very slowly
• For research: poor to fair
– “Candidate Music IR Test Collections”
• http://mypage.iu.edu/~donbyrd/MusicTestCollections.HTML
– Representation “CMN” vs. CMN (assume Western)
• For practical use: pathetic (symbolic) to good (pop audio)
– Most are commercial, especially audio
– Very little free/public domain
– …especially audio! (cf. RWC)
• IPR issues are a total mess
30
Collections (a.k.a. Databases) (2 of 2)
• Why is so little available?
– Symbolic form: no efficient way to enter
– Solution: OMR? AMR? research challenges
– Music is an art!
– Cf. “Searching CMN” slides: chicken & egg problem
– IPR issues are a total mess
31
6. Summary (1 of 2)
• Basic representations of music: audio, events, notation
– Fundamental difference: amount of explicit structure
• Have very different characteristics => each is by far best
for some users and/or applications
• Converting to reduce structure much easier than to add
• Music in all forms very hard to handle mostly because of:
– Units of meaning problem
– Polyphony
• Both problems are much less serious with text
• Huge range of searching tasks people want to do => very
different techniques appropriate
rev. Jan. 2006
32
6. Summary (2 of 2)
• Projects include
– Audio-based: via recognition of polyphonic music (OMRAS,
query-by-humming, etc.)
– CMN-based: monophonic query vs. polyphonic database
(emphasis on UI) (OMRAS)
– Style-genre identification from audio
– Music recommender systems (Pandora, Last.fm, etc.)
– Digital music libraries (Variations2, etc.; iTunes, sort of)
– Creative applications: music IR for improvisation, etc.
• Machinery to evaluate research is coming along (MIREX)
• Collections
– For research: poor to fair
– For practical use: pathetic (symbolic) to good (pop audio)
– improving, but…
– Serious problems with IPR as well as technology
rev. Jan. 2007
33