Introduction to Music Information Retrieval (MIR)
J.-S. Roger Jang (張智星)
MIR Lab, CSIE Dept., National Taiwan Univ.
http://mirlab.org/jang
2016/4/4
How to Search for a Song?
Content-based search
Melody
Mood
Genre
Instrument
Chords
Cover song
島谷ひとみ / 亜麻色の髪の乙女
Types of Search of MIR Systems
Metadata-based (easier)
Song title, artist, tags, composer, …
Query input: text or speech
Content-based (harder)
Melody, lyrics, mood, genre, chord, instruments, …
Query input:
Symbolic: notes, chords, text, …
Acoustic: singing, humming, whistling, tapping, speech, a recording of the exact example, beatboxing, …
Types of Acoustic Inputs for MIR
Singing/humming
Query by humming (usually with “ta” or “da”)
Query by singing
Whistling
Query by whistling
Wolf whistle
Hand whistle
Fingerless whistle
Leaf whistle
Tapping
Query by tapping (at the onsets of notes)
Speech
Query by speech (for lyrics or metadata)
Exact but noisy example
Query by example (noisy version of the original clip)
Beatboxing
Types of Contents for Comparison
Melody
Query by humming
Query by singing
Query by whistling
Note onsets
Query by tapping (at the onsets of notes)
Metadata
Query by speech
Audio contents
Query by example (noisy versions of original clips)
Drum patterns
Query by beatboxing
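For query by tapping, only the note onsets are compared. A minimal sketch of one way to do this, assuming onsets are given in seconds, are strictly increasing, and that tempo differences are removed by normalizing the inter-onset intervals (IOIs). The function names and the L1 distance are illustrative assumptions, not the method used in the lecture.

```python
def normalized_ioi(onsets):
    """Convert onset times (seconds) to inter-onset intervals that sum to 1."""
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    total = sum(iois)
    return [x / total for x in iois]


def tapping_distance(query_onsets, song_onsets):
    """L1 distance between the normalized IOIs of the query and of the
    first len(query_onsets) notes of the song (hypothetical helper)."""
    n = len(query_onsets)
    if n < 2 or len(song_onsets) < n:
        return float("inf")  # not enough onsets to compare
    q = normalized_ioi(query_onsets)
    s = normalized_ioi(song_onsets[:n])
    return sum(abs(a - b) for a, b in zip(q, s))


# A query tapped at twice the tempo of the song still matches its opening.
song = [0.0, 0.5, 1.0, 2.0, 2.5]
query = [0.0, 0.25, 0.5, 1.0]
distance = tapping_distance(query, song)
```

Normalizing by the total duration makes the comparison independent of tapping speed; it does not handle inserted or missed taps, which would need an alignment method.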
Questions or comments on MIR?
Introduction to QBSH
QBSH: Query by Singing/Humming
Input: singing or humming captured by a microphone
Output: a ranked list of songs retrieved from the song database
Progression
First paper: Around 1994
Extensive studies since 2001
State of the art: QBSH tasks at ISMIR/MIREX since 2006
Challenges in QBSH Systems
Reliable pitch tracking for acoustic input
Input from mobile devices or noisy karaoke bars
Song database preparation
MIDIs, singing clips, or audio music
Efficient/effective retrieval
Karaoke machine: ~10,000 songs
Internet music search engine: ~500,000,000 songs
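Pitch tracking is listed above as the first challenge. A minimal sketch of one classic approach, picking the peak of the autocorrelation function (ACF) of a single frame, followed by the standard conversion from Hz to semitones (MIDI note numbers, A4 = 440 Hz = 69). The function names, the search range of 80 to 1000 Hz, and the test tone are assumptions for illustration; a practical tracker would also need voicing detection and noise handling.

```python
import math


def hz_to_semitone(freq_hz):
    """Map a frequency in Hz to a fractional MIDI note number (A4 = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def acf_pitch(frame, fs, min_hz=80.0, max_hz=1000.0):
    """Estimate the pitch of one frame from its autocorrelation peak."""
    min_lag = int(fs / max_hz)
    max_lag = min(int(fs / min_hz), len(frame) - 1)
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag


# Synthetic 200 Hz tone sampled at 8 kHz (period of exactly 40 samples).
fs = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / fs) for n in range(512)]
f0 = acf_pitch(frame, fs)
semitone = hz_to_semitone(f0)
```

Working in semitones rather than Hz makes a key change a constant offset, which simplifies the later melody comparison.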
Two Types of Processing for QBSH
Offline processing
Collect pitch vectors for the DB from pure vocals, MIDI files, or pitch obtained from polyphonic music
Label the anchor position
Identify repeated patterns
Online processing
Compute pitch from the user’s query
Convert pitch into a note sequence (optional)
Compare with the DB
List the ranked results
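The last two online steps, comparing with the DB and listing the ranked results, can be sketched as follows. This is a minimal illustration, assuming pitch vectors are already in semitones, that dynamic time warping (DTW) is the comparison method, and that key differences are handled by subtracting the mean pitch. The slides do not state which comparison method is used; the function names and the toy database are hypothetical.

```python
def zero_mean(pitch):
    """Subtract the mean pitch so that key transposition is ignored."""
    m = sum(pitch) / len(pitch)
    return [p - m for p in pitch]


def dtw_distance(a, b):
    """Basic DTW with absolute-difference cost, matching both ends."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative-cost table
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[-1]


def rank_songs(query, database):
    """Return (song name, distance) pairs sorted from best to worst match."""
    q = zero_mean(query)
    scored = [(name, dtw_distance(q, zero_mean(pitch)))
              for name, pitch in database.items()]
    return sorted(scored, key=lambda item: item[1])


# Toy database of pitch vectors in semitones (hypothetical data).
database = {
    "song_a": [60, 60, 62, 64, 64, 62, 60],
    "song_b": [60, 67, 67, 65, 64, 62, 60],
}
# song_a sung three semitones higher and slightly slower.
query = [63, 63, 63, 65, 67, 67, 65, 63]
ranked = rank_songs(query, database)
```

DTW absorbs local tempo variation in the sung query; the optional note-sequence conversion is skipped here and the frame-level pitch vectors are compared directly.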
Flowchart of QBSH
[Figure: flowchart with an on-line processing part and an off-line processing part. Block labels: microphone input, filtering, pitch tracking, pitch vector smoothing, similarity comparison, query results (ranked song list), MIDI files, melody track extraction, frame-based representation.]
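The pitch vector smoothing step in the flowchart can be sketched with a median filter, which removes isolated errors such as octave jumps. The slides do not say which smoother is used, so the median filter, the function name, and the window width are assumptions.

```python
from statistics import median


def median_smooth(pitch, width=3):
    """Replace each pitch value by the median of a window centered on it.
    Windows are truncated at the two ends of the vector."""
    half = width // 2
    smoothed = []
    for i in range(len(pitch)):
        window = pitch[max(0, i - half): i + half + 1]
        smoothed.append(median(window))
    return smoothed


# A steady note (MIDI 60) with a single octave error at index 3.
raw = [60, 60, 60, 72, 60, 60, 60]
clean = median_smooth(raw, width=3)
```

A wider window removes longer error bursts but also blurs short notes, so the width has to be chosen relative to the frame rate.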
Short Latency and Strategies
Goal: retrieve songs effectively within a given response time, say about 5 seconds
Our strategies
Multi-stage progressive filtering
Indexing for different comparison methods
Repeating pattern identification
Platform upgrade: GPU is the way to go!
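The first strategy, multi-stage progressive filtering, can be sketched as a cheap comparison that prunes most of the database followed by an expensive comparison on the survivors. This is a two-stage illustration only: the coarse stage (L1 distance on short resampled vectors), the fine stage (DTW), the function names, and the toy data are all assumptions, not the stages used in the actual system.

```python
def zero_mean(pitch):
    """Subtract the mean pitch so that key transposition is ignored."""
    m = sum(pitch) / len(pitch)
    return [p - m for p in pitch]


def resample(pitch, n):
    """Pick n evenly spaced samples (nearest index) from a pitch vector."""
    last = len(pitch) - 1
    return [pitch[round(i * last / (n - 1))] for i in range(n)]


def coarse_distance(a, b, n=8):
    """Stage 1: cheap L1 distance between short, length-normalized vectors."""
    a_s, b_s = zero_mean(resample(a, n)), zero_mean(resample(b, n))
    return sum(abs(x - y) for x, y in zip(a_s, b_s))


def dtw_distance(a, b):
    """Stage 2: basic DTW with absolute-difference cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[-1]


def progressive_search(query, database, keep=2):
    """Keep the `keep` best songs by coarse distance, then rank them by DTW."""
    by_coarse = sorted(database,
                       key=lambda name: coarse_distance(query, database[name]))
    survivors = by_coarse[:keep]
    q = zero_mean(query)
    scored = [(name, dtw_distance(q, zero_mean(database[name])))
              for name in survivors]
    return sorted(scored, key=lambda item: item[1])


# Toy database of pitch vectors in semitones (hypothetical data).
database = {
    "a": [60, 60, 62, 62, 64, 64, 62, 62, 60, 60],
    "b": [60, 62, 64, 65, 67, 69, 71, 72, 72, 72],
    "c": [72, 71, 69, 67, 65, 64, 62, 60, 60, 60],
    "d": [60, 60, 60, 60, 60, 60, 60, 60, 60, 60],
}
query = [62, 62, 64, 64, 66, 66, 64, 64, 62, 62]  # song "a", two semitones up
results = progressive_search(query, database, keep=2)
```

The coarse stage costs a fixed number of operations per song, while DTW grows with the product of the two vector lengths, so shrinking the candidate set first is what keeps the response time bounded as the database grows.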