Music Information Retrieval

- or how to search for (and maybe find) music and do away with incipits
Michael Fingerhut
Multimedia Library and
Engineering Bureau
IRCAM – Centre Pompidou
IAML - IASA 2004 Congress, Oslo
IRCAM - Institut de Recherche et Coordination Acoustique/Musique
IAML - International Association of Music Libraries
IASA - International Association of Sound Archives
Presented by: Shailesh Deshpande
Why MIR?
Take 1: multi-disciplinary domain
Take 2: schematic
Take 3: typology
IRCAM cataloging tool
Music information retrieval (MIR) is the interdisciplinary
science of retrieving information from music
The paper presents three views of this domain
What is an incipit?
First few words or opening line of a book. In music – first few
notes of a composition.
Why MIR?
Storage => increased availability of musical content in digital form (locally)
- CDs, DVDs, iPods
Computing power => faster processing of large volumes of digitized content
Networks => increased availability of musical content in digital form (remotely)
- Pandora, Yahoo Music, iTunes
Technological advances + demand from consumers = attention of research and industry
Take 1: multi-disciplinary domain
Computer Science, Data Processing, AI, Pattern Recognition, Library & Information Sciences
Philosophy and Psychology
Sensory Perception, Emotions & feelings, Mental processes & intelligence
Social Sciences
Sociology & Anthropology, Culture & Institutions, Law, Commerce
Natural Science & Mathematics
General Technology
Electric, Electronic, Magnetic, Communications & Computer Engineering
The Arts
Music, Aesthetics, Composition
Take 2: schematic representation of MIR
Take 3: a typology of MIR
OCR, digitization, compression
Encoding, notation
Feature extraction
Instrument recognition
Voice recognition
Melody, Key, Harmony, Rhythm
Structural analysis
Databases, systems, networks
Objective criteria
Metadata indices (name, title, period, genre,
Full-text (with or without semantic tags)
Query by example (audio excerpt, melody,
contour, rhythm, tonality, harmony)
Acoustical characteristics
Subjective criteria
Retrieve, deliver, use
Using and reusing (annotate, combine,
Rights management (recognition, watermarking)
User studies
Music terms used in MIR
Pitch – perceived fundamental frequency of a sound. May be different from the actual
frequency because of harmonics.
Timbre – the quality of a musical note that distinguishes different types of sound
production, such as voices or musical instruments (saxophone vs. trumpet – with same
pitch and loudness)
Rhythm (aka beat) - the variation of the length and accentuation of a series of
Tempo – the speed or pace of a musical piece. Usually affects the mood of a song.
Melody – a linear succession of musical tones which is perceived as a single entity
(‘horizontal’ aspect of music)
Harmony – simultaneous use of different pitches (‘vertical’ aspect of music)
Monophony – musical texture consisting of melody without accompanying harmony
Polyphony – a texture consisting of two or more independent melodic voices
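
The note on pitch (perceived fundamental frequency may differ from the frequencies actually present) can be illustrated with a small sketch. This is not from the paper: it is a minimal autocorrelation-based estimator, and the names `estimate_f0` and `tone`, the 80-500 Hz search range and the synthetic test signal are all assumptions made for illustration. The signal contains only 400, 600 and 800 Hz, yet its repetition period corresponds to 200 Hz.

```python
import math


def estimate_f0(samples, sample_rate, fmin=80.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) from the strongest autocorrelation lag."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the signal with itself shifted by `lag`.
        score = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def tone(freqs, sample_rate, n_samples):
    """Hypothetical helper: sum of unit-amplitude sine waves at the given frequencies."""
    return [
        sum(math.sin(2 * math.pi * f * n / sample_rate) for f in freqs)
        for n in range(n_samples)
    ]


sr = 8000
# Harmonics of 200 Hz with the fundamental itself left out.
signal = tone([400.0, 600.0, 800.0], sr, 800)
print(round(estimate_f0(signal, sr)))
```

For this synthetic signal the estimate should come out at 200 Hz, a frequency that is not present as a component. Real recordings would need windowing, normalization and noise handling that this sketch leaves out.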
Common Methods
Modeling: start from a theory, look for patterns
- Look for melodies, harmonic progressions
- Attempt to find elements in data that correspond to such
Statistical methods: look for patterns, build a theory
- Perform statistical analysis on data, find common patterns and group them in clusters
- Attempt to interpret their occurrence in musical pieces
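
As a rough illustration of the statistical route (find patterns first, interpret them afterwards), here is a minimal k-means sketch. It is an assumption-laden toy, not a method from the paper: the feature vectors (scaled tempo, a "brightness" value) are invented, and the deterministic choice of the first k points as starting centroids is made only to keep the example reproducible.

```python
def kmeans(points, k, iterations=20):
    """Group points into k clusters; returns (labels, centroids)."""
    # Assumption for reproducibility: start from the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Invented features per piece: (tempo in BPM / 100, spectral brightness in 0..1).
pieces = [(0.60, 0.20), (1.40, 0.80), (0.65, 0.25), (1.35, 0.75), (0.62, 0.22), (1.45, 0.85)]
labels, centroids = kmeans(pieces, k=2)
print(labels)
```

The clusters carry no meaning by themselves; deciding that one group is, say, "slow and dark" is the interpretation step named in the last bullet.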
MIR Challenges
The integration of audiovisual, symbolic and textual data
Fingerprinting - a small, unique set of features extracted from a sound file that distinguishes it from any other sound file
Music summarization - how to select a representative excerpt that gives a good idea of the work (similar to thumbnails for image files)
Computing similarity – no unique way in which two pieces may be similar
- Melodic, rhythmic, timbre, genre, style similarities
Indexing a musical piece by melody – to allow a query-by-humming (QBH) interface
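
One simple way to picture melody indexing and melodic similarity together is a contour index: each melody is reduced to its up/down/repeat pattern (the Parsons code), and a hummed query is matched by edit distance. This is a sketch under stated assumptions, not the approach of any system named in the talk; the three-tune index, the MIDI note numbers and the function names are illustrative.

```python
def parsons_code(pitches):
    """Reduce a melody (MIDI note numbers) to its contour: U(p), D(own), R(epeat)."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            steps.append("U")
        elif cur < prev:
            steps.append("D")
        else:
            steps.append("R")
    return "*" + "".join(steps)


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


# Illustrative index: opening notes of three well-known tunes.
MELODIES = {
    "Twinkle Twinkle Little Star": [60, 60, 67, 67, 69, 69, 67],
    "Ode to Joy": [64, 64, 65, 67, 67, 65, 64, 62],
    "Frère Jacques": [60, 62, 64, 60, 60, 62, 64, 60],
}
INDEX = {title: parsons_code(notes) for title, notes in MELODIES.items()}


def search(query_pitches):
    """Rank indexed melodies by contour distance to the query, closest first."""
    query = parsons_code(query_pitches)
    return sorted((edit_distance(query, code), title) for title, code in INDEX.items())


# A hummed query: transposed up a tone, with one wrong note.
hummed = [62, 62, 69, 69, 71, 72, 69]
results = search(hummed)
print(results[0])
```

Because only the contour is stored, the query matches regardless of key, and a single wrong note costs one edit. The trade-off is that contour alone is coarse: many different melodies share the same code, which is one reason similarity is listed as a challenge.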
MIR Challenges (contd.)
Encoding of music – at acoustic, structural and semantic levels
Query-by-example – search for music by singing, humming,
whistling or playing an audio excerpt
Watermarking – adding identification information to digital
audio for DRM
Benchmarking - limited number of standardized test collections
available for evaluation of MIR systems
A tool to catalog and extract audio
CD contents for online distribution
Automatic identification of CDs
- Compute the CDDB disc ID of the CD
- CDDB disc ID - a binary number reflecting the offsets (start times) and lengths of the tracks of the CD
Metadata retrieval and correction
- Query Internet CDDB for metadata
- Allow correction
Extraction and compression
Transfer to a Web server
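
The identification step can be sketched with the classic CDDB-style disc ID computation, as it is commonly documented: a checksum over the digits of each track's start time, the total playing time, and the track count, packed into 32 bits. The talk does not say which variant the IRCAM tool uses, so treat this as an assumption; the function names and the example track offsets are invented.

```python
FRAMES_PER_SECOND = 75  # CD audio addresses are counted in frames, 75 per second


def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def cddb_disc_id(track_offsets, leadout_offset):
    """Return the 8-hex-digit disc ID for tracks given as frame offsets.

    track_offsets: start of each track, in frames from the start of the disc.
    leadout_offset: start of the lead-out (end of the last track), in frames.
    """
    start_seconds = [offset // FRAMES_PER_SECOND for offset in track_offsets]
    # Checksum byte: digit sums of all track start times, modulo 255.
    checksum = sum(digit_sum(s) for s in start_seconds) % 0xFF
    # Playing time from the first track's start to the lead-out, in seconds.
    total_seconds = leadout_offset // FRAMES_PER_SECOND - start_seconds[0]
    disc_id = (checksum << 24) | (total_seconds << 8) | len(track_offsets)
    return f"{disc_id:08x}"


# Invented three-track disc: tracks start at 2 s, 200 s and 400 s; lead-out at 600 s.
print(cddb_disc_id([150, 15000, 30000], 45000))
```

For the invented disc above this prints `08025603`. The ID is not guaranteed to be unique (different discs can share the same track layout), which is one reason the tool lets the librarian correct the retrieved metadata.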
IRCAM tool interface
When a CD is inserted in the
- The tool computes its CDDB disc ID
- Retrieves the metadata if available
- Allows the librarian to correct errors, structure the tracks into works and select names from authority lists
- When done, it adds the metadata to the catalog, extracts the tracks, compresses them and sends them to the audio server
Information sources
The International Society for Music Information Retrieval
University of Illinois’ Graduate School of Library and Information Science
The Listen Game — UCSD Computer Audition Lab MIR music
ranking game (Herd It on Facebook)
Multi-player game where you listen to music with lots of other people
(aka the Herd). You are asked to describe the music (genre, mood,
singer etc.) and get points when the Herd agrees with you.
Innovative way to harness the power of social networking and collect
metadata for MIR