
Introduction to
Music Information Retrieval (MIR)
J.-S. Roger Jang (張智星)
MIR Lab, CSIE Dept., National Taiwan Univ.
http://mirlab.org/jang
2016/4/4
1
How to Search for a Song?
Content-based search
Melody
Mood
Genre
Instrument
Chords
Cover song
島谷ひとみ (Hitomi Shimatani) / 亜麻色の髪の乙女 (Amairo no Kami no Otome)
-3-
Types of Search in MIR Systems
Metadata-based: easier
Song title, artist, tags, composer, …
Query input: text or speech
Content-based: harder
Melody, lyrics, mood, genre, chord, instruments, …
Query input:
Symbolic: notes, chord, text, …
Acoustic: singing, humming, whistling, tapping, speech, recording of exact example, beatboxing, …
-4-
Types of Acoustic Inputs for MIR
Singing/humming
Query by humming (usually “ta” or “da”)
Query by singing
Whistling
Query by whistling
Wolf whistle
Hand whistle
Fingerless whistle
Leaf whistle
Tapping
Query by tapping (at the onsets of notes)
Speech
Query by speech (for lyrics or metadata)
Exact but noisy example
Query by example (a noisy version of the original clip)
Beatboxing
-5-
Types of Contents for Comparison
Melody
Query by humming
Query by singing
Query by whistling
Note onsets
Query by tapping (at the onsets of notes)
Metadata
Query by speech
Audio contents
Query by example (noisy versions of original clips)
Drum patterns
Query by beatboxing
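One way to compare note onsets for query by tapping is to match inter-onset intervals (IOIs) after normalizing away the tempo. The sketch below is only an illustration under that assumption; the slides do not specify the comparison method, and the names `normalized_iois` and `tapping_distance` are hypothetical. It also assumes both sequences contain the same number of onsets.

```python
def normalized_iois(onsets):
    """Convert onset times (seconds) into inter-onset intervals that sum to 1."""
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    total = sum(iois)
    return [x / total for x in iois]


def tapping_distance(query_onsets, song_onsets):
    """Mean absolute difference between tempo-normalized IOI sequences.

    Assumption: both sequences have the same number of onsets.
    """
    q = normalized_iois(query_onsets)
    s = normalized_iois(song_onsets)
    if len(q) != len(s):
        raise ValueError("this sketch requires equal numbers of onsets")
    return sum(abs(a - b) for a, b in zip(q, s)) / len(q)


# The query taps the same rhythm as the song, but at half the tempo.
song = [0.0, 0.25, 0.5, 1.0]
query = [0.0, 0.5, 1.0, 2.0]
other = [0.0, 0.5, 0.75, 1.0]
print(tapping_distance(query, song), tapping_distance(query, other))
```

Because the intervals are divided by their total duration, a user who taps faster or slower than the recording still matches the right rhythm.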
-6-
Questions or comments on MIR?
-7-
Introduction to QBSH
QBSH: Query by Singing/Humming
Input: Singing or humming from microphone
Output: A ranked list of songs retrieved from the song database
Progression
First paper: Around 1994
Extensive studies since 2001
State of the art: QBSH tasks at ISMIR/MIREX since 2006
-8-
Challenges in QBSH Systems
Reliable pitch tracking for acoustic input
Input from mobile devices or noisy karaoke bars
Song database preparation
MIDIs, singing clips, or audio music
Efficient/effective retrieval
Karaoke machine: ~10,000 songs
Internet music search engine: ~500,000,000 songs
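To see why database size makes efficient retrieval a challenge, the sketch below divides a response-time budget evenly across every song in the database. The 5-second budget is an assumption borrowed from the latency goal stated later in these slides, and `per_song_budget` is a hypothetical helper name.

```python
def per_song_budget(response_time_s, num_songs):
    """Average time (seconds) available per song if every song is compared."""
    return response_time_s / num_songs


# Assumed budget: 5 seconds per query (an assumption for this calculation).
karaoke = per_song_budget(5.0, 10_000)
internet = per_song_budget(5.0, 500_000_000)
print(f"karaoke machine: {karaoke * 1e3:.3f} ms per song")
print(f"internet engine: {internet * 1e9:.1f} ns per song")
```

Under this assumption an exhaustive comparison has about 0.5 ms per song on a karaoke machine but only about 10 ns per song at Internet scale, which is far too little for a full melody comparison against every song.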
-9-
-10-
Two Types of Processing for QBSH
Offline processing
Collect pitch vectors for DB
Pure vocals
MIDI files
Pitch obtained from polyphonic music
Label the anchor position
Identify repeated patterns
Online processing
Compute pitch from the user’s query
Convert pitch into note sequence (optional)
Compare with DB
List the ranked results
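The first online step, computing pitch from the user's query, can be sketched as frame-based pitch tracking. The example below uses a plain autocorrelation peak picker, which is one common technique and not necessarily the method used by the author. The function names, the 512-sample frame, the 256-sample hop, and the 80 to 1000 Hz search range are all assumptions, and the sketch does no silence detection or smoothing.

```python
import math


def track_pitch(frame, fs, min_f0=80.0, max_f0=1000.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    Picks the lag with the largest autocorrelation inside the allowed range.
    """
    min_lag = int(fs / max_f0)
    max_lag = min(int(fs / min_f0), len(frame) - 1)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag


def hz_to_semitone(f0):
    """Map frequency in Hz to a (fractional) MIDI note number; A4 = 440 Hz = 69."""
    return 69.0 + 12.0 * math.log2(f0 / 440.0)


def pitch_vector(signal, fs, frame_size=512, hop=256):
    """Cut the signal into overlapping frames and return one pitch per frame."""
    starts = range(0, len(signal) - frame_size + 1, hop)
    return [hz_to_semitone(track_pitch(signal[i:i + frame_size], fs)) for i in starts]


# Synthetic "humming": a 200 Hz sine sampled at 8 kHz.
fs = 8000
signal = [math.sin(2 * math.pi * 200 * n / fs) for n in range(1024)]
pv = pitch_vector(signal, fs)
print([round(p, 2) for p in pv])
```

Expressing pitch in semitones rather than Hz makes a key transposition a constant offset, which is convenient when the pitch vector is later compared with the database.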
-11-
Flowchart of QBSH
On-line processing: microphone input, filtering, frame-based representation, pitch tracking, pitch vector smoothing, similarity comparison, query results (ranked song list)
Off-line processing: MIDI files, melody track extraction
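The similarity comparison box can be illustrated with dynamic time warping (DTW) over semitone pitch vectors. This is a minimal sketch under the assumption that DTW with mean removal is used; the flowchart itself does not name the comparison method, and `mean_shift`, `dtw_distance`, and `rank_songs` are hypothetical names.

```python
def mean_shift(pitch):
    """Subtract the mean so that a constant key transposition cancels out."""
    m = sum(pitch) / len(pitch)
    return [p - m for p in pitch]


def dtw_distance(query, target):
    """Classic DTW cost between two pitch vectors (absolute semitone difference)."""
    q, t = mean_shift(query), mean_shift(target)
    inf = float("inf")
    prev = [inf] * (len(t) + 1)
    prev[0] = 0.0  # empty query aligned with empty target costs nothing
    for i in range(1, len(q) + 1):
        curr = [inf] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            cost = abs(q[i - 1] - t[j - 1])
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(t)]


def rank_songs(query, database):
    """Return song names sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: dtw_distance(query, database[name]))


# Toy database of pitch vectors in semitones (made-up data for illustration).
database = {
    "A": [60, 60, 62, 62, 64, 64],
    "B": [60, 59, 57, 55, 53, 52],
}
query = [65, 67, 67, 69]  # melody A, sung 5 semitones higher with different timing
print(rank_songs(query, database))
```

DTW tolerates the uneven timing typical of sung or hummed queries, at the price of a cost that grows with the product of the two sequence lengths.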
-13-
Short Latency and Strategies
Goal: Retrieve songs effectively within a given response time, say about 5 seconds
Our strategies
Multi-stage progressive filtering
Indexing for different comparison methods
Repeating pattern identification
Platform upgrade: GPU is the way to go!
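Multi-stage progressive filtering can be sketched as a cascade: a cheap, coarse comparison discards most candidates, and a finer, costlier comparison runs only on the survivors. The code below is an illustration under assumptions of my own, not the author's system: each stage compares pitch vectors resampled to a fixed number of points, and the names `resample`, `scaled_distance`, and `progressive_filter` are hypothetical.

```python
def resample(pitch, n):
    """Pick n roughly evenly spaced samples from a pitch vector (n >= 2)."""
    last = len(pitch) - 1
    return [pitch[round(i * last / (n - 1))] for i in range(n)]


def scaled_distance(query, target, n):
    """Mean absolute difference after resampling to n points and removing the mean."""
    q, t = resample(query, n), resample(target, n)
    qm, tm = sum(q) / n, sum(t) / n
    return sum(abs((a - qm) - (b - tm)) for a, b in zip(q, t)) / n


def progressive_filter(query, database, stages):
    """Run the stages in order; each is (n_points, keep), cheapest first."""
    candidates = list(database)
    for n_points, keep in stages:
        candidates.sort(key=lambda name: scaled_distance(query, database[name], n_points))
        candidates = candidates[:keep]  # only survivors reach the next stage
    return candidates


# Toy database of semitone pitch vectors (made-up data for illustration).
database = {
    "up": list(range(60, 76)),
    "down": list(range(75, 59, -1)),
    "flat": [60] * 16,
    "zigzag": [60, 64] * 8,
}
query = [p + 3 for p in range(60, 76)]  # "up", transposed by 3 semitones
# Stage 1: 4-point comparison keeps 2 songs; stage 2: 16-point comparison keeps 1.
print(progressive_filter(query, database, [(4, 2), (16, 1)]))
```

The expensive stage then touches only a small fraction of the database, which is what makes a fixed response-time budget plausible as the database grows.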
-14-