Music Information Retrieval Systems
Author: Amanda Cohen


Based on content from the following webpage:
http://mirsystems.info/index.php?id=mirsystems
Other good sources on MIR and MIR systems:
 http://www.music-ir.org - Virtual home of music information retrieval research
 http://www.ismir.net - The International Symposium on Music Information Retrieval
Audentify!


Developers: F. Kurth, A. Ribbrock, M. Clausen
Relevant Publications:
 Kurth, F., Ribbrock, A., Clausen, M. Identification of Highly Distorted Audio Material for Querying Large Scale Data Bases. 112th Convention of the Audio Engineering Society, May 2002, Munich, Convention Paper
 Kurth, F., Ribbrock, A., Clausen, M. Efficient Fault Tolerant Search Techniques for Full-Text Audio Retrieval. 112th Convention of the Audio Engineering Society, May 2002, Munich, Convention Paper
 Ribbrock, A., Kurth, F. A Full-Text Retrieval Approach to Content-Based Audio Identification. International Workshop on Multimedia Signal Processing, St. Thomas, US Virgin Islands, December 9-11, 2002
 Kurth, F. A Ranking Technique for Fast Audio Identification. International Workshop on Multimedia Signal Processing, St. Thomas, US Virgin Islands, December 9-11, 2002
 Clausen, M., Kurth, F. A Unified Approach to Content-Based and Fault Tolerant Music Recognition, IEEE Transactions on Multimedia. Accepted for publication
Audentify

System Description:
 Takes signal queries (1-5 seconds, 96-128 kbps)
 Searches by audio fingerprint
 Returns a file ID that corresponds to a song in the database
 Currently part of the SyncPlayer system
SyncPlayer


Developers: F. Kurth, M. Müller, D. Damm, C. Fremerey, A. Ribbrock, M. Clausen
Relevant Publications:
 Kurth, F., Müller, M., Damm, D., Fremerey, Ch., Ribbrock, A., Clausen, M. SyncPlayer - An Advanced System for Multimodal Music Access, Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005), London, GB
 Kurth, F., Müller, M., Ribbrock, A., Röder, T., Damm, D., Fremerey, Ch. A Prototypical Service for Real-Time Access to Local Context-Based Music Information. Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR 2004), Barcelona, Spain. http://wwwmmdb.iai.uni-bonn.de/download/publications/kurth-service-ismir04.pdf
 Fremerey, Ch. SyncPlayer - A Framework for Content-Based Music Navigation, Diplomarbeit (diploma thesis), Multimedia Signal Processing Group of Prof. Dr. Michael Clausen, University of Bonn, 2006, Bonn, Germany
URL: http://audentify.iai.uni-bonn.de/synchome/index.php?pid=01
SyncPlayer

System Description:
 Query type(s): audio files (mp3, wav, MIDI), lyrics, MusicXML, score scans (primary data)
 Generates "derived data" from the query: extracts features, generates annotations, compiles synchronization data
 Submitted to the SyncPlayer Server, which can perform three services (at present): audio identification (through Audentify), providing annotations for a given song, and retrieval in lyrics annotations
 SyncPlayer Client: audio-visual user interface that allows the user to play back, navigate, and search in the primary data
ChoirFish


Developers: A. Van Den Berg, S. Groot
Relevant Publications:
 Groot, S., Van Den Berg, A., The Singing Choirfish: An Application for Tune Recognition, Proceedings of the 2003 Speech Recognition Seminar, LIACS 2003
URL: http://www.ookii.org/university/speech/default.aspx
ChoirFish

System Description:
 Query by humming
 Contour features used for matching
 Uses Parsons code to determine contour
 Code is based on the direction of note transitions
 Three characters, one for each possible direction:
 R: the note is the same frequency as the previous note
 D: the note is lower in frequency than the previous note
 U: the note is higher in frequency than the previous note
 Generated by converting the audio to the frequency domain via Fast Fourier Transform and using the highest frequency peak to determine pitch and pitch change
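The contour step above can be sketched directly. This is an illustrative reading of the slide, not ChoirFish's actual code; the function name and the `tolerance` parameter are my own:

```python
def parsons_code(pitches, tolerance=0.0):
    """Encode a pitch sequence as a Parsons-style contour string.

    'R' = repeat (same pitch), 'U' = up, 'D' = down -- the three
    characters ChoirFish derives from successive FFT peak frequencies.
    """
    code = []
    for prev, cur in zip(pitches, pitches[1:]):
        if abs(cur - prev) <= tolerance:
            code.append("R")   # same frequency as the previous note
        elif cur > prev:
            code.append("U")   # higher than the previous note
        else:
            code.append("D")   # lower than the previous note
    return "".join(code)
```

For example, the pitch sequence of "Twinkle, Twinkle, Little Star" (C C G G A A G) encodes as "RURURD".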
CubyHum


Developers: S. Pauws
Relevant Publications:
 Pauws, S., CubyHum: A Fully Operational Query by Humming System, ISMIR 2002 Conference Proceedings (2002): 187-196, CiteSeerX 10.1.1.108.8515
 PDF of paper: http://ismir2002.ismir.net/proceedings/02-FP06-2.pdf
CubyHum

System Description:
 Query by Humming: user queries the system by humming the desired song
 Pitch is estimated by computing the sum of harmonically compressed spectra (sub-harmonic summation, or SHS)
 Musical events (note onsets, gliding tones, inter-onset intervals) are detected
 Query is transformed via quantization into a musical score, which is used to create a MIDI melody for auditory feedback
 Approximate pattern matching is used to find the matching song
 Distance between melodies is defined based on interval sizes and duration ratios to compensate for imperfect queries (people don't always hum the correct melody in the correct key)
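Sub-harmonic summation can be illustrated in a few lines: every candidate fundamental is credited with the spectral magnitude at its harmonics, with a decay per harmonic. This is a generic SHS sketch, not CubyHum's implementation; the candidate grid, harmonic count, and decay factor are assumptions:

```python
import numpy as np

def shs_pitch(signal, sr, fmin=80.0, fmax=500.0, n_harmonics=5, decay=0.84):
    """Estimate pitch by sub-harmonic summation: each candidate f0
    accumulates decayed spectral magnitude at its harmonic positions."""
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    n = len(signal)
    candidates = np.arange(fmin, fmax, 1.0)       # 1 Hz candidate grid
    salience = np.zeros(len(candidates))
    for i, f0 in enumerate(candidates):
        for h in range(1, n_harmonics + 1):
            bin_idx = int(round(h * f0 * n / sr))  # nearest FFT bin of h-th harmonic
            if bin_idx < len(spectrum):
                salience[i] += decay ** (h - 1) * spectrum[bin_idx]
    return float(candidates[int(np.argmax(salience))])
```

The decay weighting is what makes the true fundamental beat its sub-octave: the candidate one octave below only collects every second harmonic, at lower weight.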
Fanimae


Developers: Iman S.H. Suyoto, Alexandra L. Uitdenbogerd and Justin Zobel
Relevant Publications:
 Suyoto, I.S.H., Uitdenbogerd, A.L., Simple Efficient n-gram Indexing for Effective Melody Retrieval, Proceedings of the Annual Music Information Retrieval Evaluation eXchange, 2005
URL: http://mirt.cs.rmit.edu.au/fanimae/
Fanimae

System Description:
 Desktop music information retrieval system
 Search by symbolic melodic similarity
 Query: a melody sequence that contains both pitch and duration information
 Melody sequence is standardized: intervals are encoded as a number of semitones, with large intervals being reduced
 Coordinate matching used to detect melodic similarity:
 query is split into n-grams of length 5, as are any possible answers
 count the common distinct terms between query and possible answer
 return results ranked by similarity
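Coordinate matching over n-grams is simple to sketch. This is illustrative only: a linear scan over candidates, whereas Fanimae builds an inverted index, and the function names are mine:

```python
def ngram_coordinate_match(query, candidates, n=5):
    """Rank candidate melodies (interval sequences) against a query by
    coordinate matching: count distinct shared length-n substrings."""
    def ngrams(seq):
        # set of distinct n-grams, each a tuple of n consecutive intervals
        return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}
    q = ngrams(query)
    scored = [(len(q & ngrams(c)), i) for i, c in enumerate(candidates)]
    return sorted(scored, reverse=True)  # highest shared count first
```

Each sequence here is a list of semitone intervals, matching the slide's standardized melody representation.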
Foafing the Music


Developers: Music Technology Group of the Universitat Pompeu Fabra
Relevant Publications:
 Celma, O., Ramírez, M., Herrera, P., Foafing the Music: A Music Recommendation System Based on RSS Feeds and User Preferences, Proceedings of the 6th International Conference on Music Information Retrieval; London, UK, 2005, http://ismir2005.ismir.net/proceedings/3119.pdf
URL: http://foafing-the-music.iua.upf.edu
Foafing the Music

System Description:
 Returns personalized music recommendations based on a user's profile (listening habits, location)
 Bases recommendation information on info gathered across the web
 Similarity between artists determined by their relationships to one another (ex: influences, followers)
 Creates RSS feed for news related to favorite artists
 Computes musical similarity between specific songs
Meldex/Greenstone


Developers: McNab, Smith, Bainbridge, Witten
Relevant Publications:
 McNab, Smith, Bainbridge, Witten, The New Zealand Digital Library MELody inDEX, D-Lib Magazine, May 1997
URL: http://www.nzdl.org/fast-cgibin/music/musiclibrary
Meldex/Greenstone

System Description:
 Receives audio queries (hummed, sung, or played audio)
 Filters audio to get the fundamental frequency
 Input sent to a pitch tracker, which returns an average pitch estimate for each 20 ms
 Note duration can optionally be taken into account, as well as user-defined tuning
 Results found using approximate string matching based on melodic contour
Musipedia/Melodyhound/Tuneserver


Developer: Rainer Typke
Relevant Publications:
 Prechelt, L., Typke, R., An Interface for Melody Input, ACM Transactions on Computer-Human Interaction, June 2001
URL: http://www.musipedia.org
Musipedia/Melodyhound/Tuneserver

System Description:
 Query by humming system
 Records sound; system converts it into a sound wave
 Converts query sound wave into Parsons code
 Matches by melodic contour
 Determines distance between query and possible results via edit distance (the number of modifications necessary to turn one string into the other)
 Returns results with the smallest distance
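The edit-distance step is the classic dynamic program over two contour strings. A generic Levenshtein implementation, not Musipedia's code:

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))        # distances for the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]                          # deleting all i characters so far
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute (free if equal)
        prev = cur
    return prev[-1]
```

Over Parsons strings this gives exactly the "number of modifications" the slide describes, so ranking candidates by ascending distance yields the result order.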
MIDIZ


Developers: Maria Cláudia Reis Cavalcanti, Marcelo Trannin Machado, Alessandro de Almeida Castro Cerqueira, Nelson Sampaio Araujo Júnior and Geraldo Xexéo
Relevant Publications:
 Cavalcanti, Maria Cláudia Reis et al. MIDIZ: Content Based Indexing and Retrieving MIDI Files. J. Braz. Comp. Soc. [online]. 1999, vol. 6, no. 2 [cited 2008-11-02]. http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0104-65001999000300002&lng=&nrm=iso ISSN 0104-6500. doi: 10.1590/S0104-65001999000300002
MIDIZ

System Description:
 Database that stores, indexes, searches for, and recovers MIDI files based on the description of a short musical passage
 Allows for non-exact queries
 Musical sequence is based on intervals between notes
 Uses a wavelet transform and a sliding window over the melody:
 window defines a note sequence of a given size (2^k) and moves through the song note by note
 each sequence in the window is converted into a vector storing the interval distances
 first note in a sequence is assigned the value 1
 values of the following notes are determined by their chromatic distance in relation to the first note
 those values are added together in pairs, and the result is converted into coordinates in the final vector
 Songs in the database are stored in a BD-Tree, determined by Discriminator Zone Expression
 Completed vector of the query is submitted to the tree; similar results are returned
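One possible reading of the windowing step, sketched with hypothetical names. The pairwise folding into the final coordinates is my simplification of the slide's description, not MIDIZ's exact scheme:

```python
def window_vectors(notes, k=2):
    """Slide a window of 2**k notes through a melody (MIDI note numbers).
    Per window: first note becomes 1, the rest become chromatic distances
    from the first note, then adjacent values are summed in pairs."""
    size = 2 ** k
    vectors = []
    for start in range(len(notes) - size + 1):
        window = notes[start:start + size]
        rel = [1] + [n - window[0] for n in window[1:]]   # relative to first note
        folded = [rel[i] + rel[i + 1] for i in range(0, size, 2)]  # pairwise sums
        vectors.append(folded)
    return vectors
```

Each folded vector would then serve as the coordinates indexed by the BD-Tree.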
Mu-seek



Developers: Darfforst LLP
URL: http://www.mu-seek.com/
System Description:
 Search by title, lyrics, tune fragment, or MIDI
 Uses pitch, contour, and rhythm to find matches
MusicSurfer


Developers: Music Technology Group of the Universitat Pompeu Fabra
Relevant Publications:
 Cano et al. An Industrial-Strength Content-based Music Recommendation System, Proceedings of the 28th Annual International ACM SIGIR Conference; Salvador, Brazil, 2005. http://mtg.upf.edu/files/publications/3ac0d3SIGIR05-pcano.pdf
URL: http://musicsurfer.iua.upf.edu/
MusicSurfer

System Description:
 Automatically extracts features from songs in the database based on rhythm, instrumentation, and harmony
 Uses spectral analysis to determine timbre
 Uses those features to search for similar songs
NameMyTune



Developers: Strongtooth, Inc.
URL: http://www.namemytune.com/
System Description:
 User hums query into a microphone
 Results are found by other users determining what the song is
Orpheus


Developers: Rainer Typke
Relevant Publications:
 Typke, Giannopoulos, Veltkamp, Wiering, van Oostrum, Using Transportation Distances for Measuring Melodic Similarity, ISMIR 2003
URL: http://teuge.labs.cs.uu.nl/Ruu/?id=5
Orpheus

System Description:
 Query can be an example from the database, a hummed or whistled melody, or a MIDI file
 All queries are converted into the internal database format before submission
 Similarity between query and results based on Earth Mover's Distance:
 two distributions are represented by signatures
 distance represents the amount of "work" required to change one signature into the other
 work = user-defined distance between two signatures
Probabilistic “Name That Song”


Developers: Eric Brochu and Nando de Freitas
Publications:
 Brochu, E., de Freitas, N., "Name That Song!": A Probabilistic Approach to Querying on Music and Text. NIPS. Neural Information Processing Systems: Natural and Synthetic 2002 (2003)
Probabilistic “Name That Song”

System Description:
 Query is composed of note transitions (Qm) and words (Qt). A match is found when a corresponding song has all elements of Qm and Qt with a frequency of 1 or greater.
 Database songs are clustered. Query is performed on each song in each cluster until a match is found.
Query by Humming (Ghias et al.)


Developers: Asif Ghias, Jonathan Logan, David Chamberlin, Brian C. Smith
Relevant Publications:
 Ghias, A., Logan, J., Chamberlin, D., Smith, B.C., Query by Humming - Musical Information Retrieval in an Audio Database, ACM Multimedia (1995)
URL: http://www.cs.cornell.edu/Info/Faculty/bsmith/query-by-humming.html
Query by Humming (Ghias et al.)

System Description:
 Hummed queries are recorded in Matlab
 Pitch tracking is performed
 Query is converted into a string of intervals similar to Parsons code (U/D/S used as characters instead of U/D/R)
 Baeza-Yates/Perleberg pattern matching algorithm used to find pattern matches:
 finds all instances of the query string in the result string with at most k mismatches
 Results returned in order of how well they fit the query
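The k-mismatch criterion can be shown with a naive scan. Ghias et al. use the faster Baeza-Yates/Perleberg algorithm, but it returns the same set of matches as this brute-force version:

```python
def k_mismatch_positions(pattern, text, k):
    """Find every position where pattern occurs in text with at most k
    character mismatches (Hamming-style, no insertions or deletions)."""
    hits = []
    for i in range(len(text) - len(pattern) + 1):
        window = text[i:i + len(pattern)]
        mismatches = sum(p != t for p, t in zip(pattern, window))
        if mismatches <= k:
            hits.append((i, mismatches))   # position and its mismatch count
    return hits
```

Sorting hits by mismatch count gives the "how well they fit the query" ordering the slide mentions.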
Search by Humming


Developers: Steven Blackburn
Relevant Publications:
 Blackburn, S. G., Content Based Retrieval and Navigation of Music Using Melodic Pitch Contours. PhD Thesis, 2000
 Blackburn, S. G., Content Based Retrieval and Navigation of Music. Masters, 1999
 DeRoure, D., El-Beltagy, S., Blackburn, S. and Hall, W., A Multiagent System for Content Based Navigation of Music. ACM Multimedia 1999 Proceedings Part 2, pages 63-6
 Blackburn, S. G. and DeRoure, D. C., A Tool for Content Based Navigation of Music. Proceedings of ACM Multimedia 1998, pages 361-368
 DeRoure, D. C. and Blackburn, S. G., Amphion: Open Hypermedia Applied to Temporal Media, Wiil, U. K., Eds. Proceedings of the 4th Open Hypermedia Workshop, 1998, pages 27-32
 DeRoure, D. C., Blackburn, S. G., Oades, L. R., Read, J. N. and Ridgway, N., Applying Open Hypermedia to Audio, Proceedings of ACM Hypertext 1998, pages 285-286
URL: http://www.beeka.org/research.html
Search by Humming

System Description:
 Takes query by humming, by example, or by MIDI
 Queries and database contents represented by gross melodic pitch contour:
 within the database, each track is stored as a set of overlapping sub-contours of a constant length
 distance between songs is determined by the minimum cost of transforming one contour into another (similar to EMD)
 Query is expanded into a set of all possible contours of the same length as the database's sub-contours
 A score is calculated for each file based on the number of times a contour in the expanded query set occurs in the file; results are sorted in order of score
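The expansion and scoring steps can be sketched as follows. The alphabet, the suffix-only expansion, and the function names are my assumptions; the real system constrains the expansion rather than brute-forcing every completion:

```python
from itertools import product

def expand_query(query, length, alphabet="UDR"):
    """Expand a short contour query into every contour of the target
    length that begins with it (one reading of the expansion step)."""
    pad = length - len(query)
    return [query + "".join(tail) for tail in product(alphabet, repeat=pad)]

def score_file(query, sub_contours, length):
    """Score a file by how many of its overlapping sub-contours appear
    in the expanded query set; files are then ranked by this score."""
    expanded = set(expand_query(query, length))
    return sum(1 for c in sub_contours if c in expanded)
```

With sub-contour length 3 and a query "UD", the expansion contains "UDU", "UDD", and "UDR", and every occurrence of any of them in a file adds one to that file's score.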

SOMeJB (The SOM enhanced JukeBox)


Developers: Andreas Rauber, Markus Frühwirth, E. Pampalk, D. Merkl
Relevant Publications:
 A. Rauber, E. Pampalk, D. Merkl, The SOM-enhanced JukeBox: Organization and Visualization of Music Collections Based on Perceptual Models, Journal of New Music Research (JNMR), Swets and Zeitlinger, 2003
 E. Pampalk, A. Rauber, D. Merkl, Content-based Organization and Visualization of Music Archives. In: Proceedings of ACM Multimedia 2002, pp. 570-579, December 1-6, 2002, Juan-les-Pins, France
 A. Rauber, E. Pampalk, D. Merkl, Using Psycho-Acoustic Models and Self-Organizing Maps to Create a Hierarchical Structuring of Music by Musical Styles, Proceedings of the 3rd International Symposium on Music Information Retrieval (ISMIR 2002), pp. 71-80, October 13-17, 2002, Paris, France
 A. Rauber, E. Pampalk, D. Merkl, Content-based Music Indexing and Organization, Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 02), pp. 409-410, August 11-15, 2002, Tampere, Finland
 A. Rauber and M. Frühwirth, Automatically Analyzing and Organizing Music Archives, Proceedings of the 5th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2001), Sept. 4-8, 2001, Darmstadt
URL: http://www.ifs.tuwien.ac.at/~andi/somejb
SOMeJB (The SOM enhanced JukeBox)

System Description:
 Interface is a static, web-based map where similar pieces of music are clustered together
 Music organized by a novel set of features based on rhythm patterns in a set of frequency bands and psycho-acoustically motivated transformations
 Extracts features that capture loudness sensation (intensity) and rhythm
 Self-organizing map algorithm is applied to organize the pieces on a map (trained neural network)
SoundCompass


Developers: Naoko Kosugi, Yuichi Nishihara, Tetsuo Sakata, Masashi Yamamuro and Kazuhiko Kushima, NTT Laboratories
System Description:
 User sets a metronome and hums the melody in time with the clicks
 Database songs have three feature vectors:
 Tone Transition Feature Vector: contains the dominant pitch for each 16-beat window
 Partial Tone Transition Feature Vector: covers a time window different from the Tone Transition Feature Vector
 Tone Distribution Feature Vector: histogram containing the note distribution
 Query is matched against each of the vectors; results are combined by determining the minimum
Tararira


Developers: Ernesto López, Martin Rocamora, Gonzalo Sosa
Relevant Publications:
 E. López and M. Rocamora. Tararira: Sistema de búsqueda de música por melodía cantada (a system for searching music by sung melody). X Brazilian Symposium on Computer Music, October 2005
URL: http://iie.fing.edu.uy/investigacion/grupos/gmm/proyectos/tararira/
Tararira

System Description:
 User submits a hummed query
 Pitch tracking applied to the query
 Audio segmentation determines note boundaries
 Melodic analysis adjusts pitches to the tempered scale
 Results found by coding the query note sequence, finding occurrences using flexible similarity rules (string matching), and refining the selection using pitch time series
TreeQ


Developer: Jonathan Foote
Publications:
 Foote, J.T., Content-Based Retrieval of Music and Audio, in C.-C. J. Kuo et al., editor, Multimedia Storage and Archiving Systems II, Proc. of SPIE, Vol. 3229, pp. 138-147, 1997
URL: http://sourceforge.net/projects/treeq/
TreeQ

System Description:
 Primarily query by example; can also search by classification
 Tree-based supervised vector quantizer is built from labeled training data:
 database audio is parameterized via conversion into MFCC and energy vectors
 each resulting vector is quantized into the tree
 vector space is divided into "bins"; any MFCC vector will fall into one bin
 a histogram based on the distribution of MFCC vectors into each bin is created for query and database audio
 Songs matched based on histograms of feature counts at tree leaves:
 distance is determined using the Euclidean distance between corresponding templates of each audio clip
 results sorted by magnitude and returned as a ranked list
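The histogram-template comparison can be sketched with a stand-in quantizer. TreeQ's quantizer is a learned tree over MFCC vectors; here any function mapping a feature vector to a bin index plays that role, and the function names are mine:

```python
import numpy as np

def histogram_template(vectors, n_bins, quantize):
    """Build a normalized bin-count histogram ("template") from feature
    vectors, with `quantize` standing in for the tree-based quantizer."""
    counts = np.zeros(n_bins)
    for v in vectors:
        counts[quantize(v)] += 1       # each vector falls into exactly one bin
    return counts / counts.sum()

def template_distance(h1, h2):
    """Euclidean distance between two histogram templates; sorting
    candidates by this distance yields the ranked result list."""
    return float(np.linalg.norm(h1 - h2))
```

A trivial threshold quantizer is enough to exercise the pipeline; swapping in a real tree only changes how bin indices are assigned.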
VisiTunes




Developers: Scott McCaulay
Further Information: www.slis.indiana.edu/research/phd_forum/2006/mccaulay.doc
URL: http://www.naivesoft.com/
System Description:
 Uses audio content of songs to calculate similarity between pieces of music and creates playlists based on the results
 Converts sample values of each frame to frequency data
 Extracts the sum total of sound energy by frequency band
 Uses the results to simplify the audio data into 256 integer values for fast comparison
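The per-frame reduction can be sketched as follows. Frame length, band count, and the 0-255 scaling are my assumptions; only the idea of summing energy per band and shrinking it to small integers comes from the slide:

```python
import numpy as np

def band_energy_signature(frame, n_bands=8):
    """Collapse one audio frame into per-band energy totals scaled to
    small integers, in the spirit of VisiTunes' compact signatures."""
    power = np.abs(np.fft.rfft(frame)) ** 2       # energy per frequency bin
    bands = np.array_split(power, n_bands)        # equal-width frequency bands
    energies = np.array([b.sum() for b in bands])
    peak = energies.max()
    if peak == 0:
        return np.zeros(n_bands, dtype=int)       # silent frame
    return (255 * energies / peak).astype(int)    # one byte-sized value per band
```

Concatenating such per-frame values across a song would yield the kind of small fixed-size integer signature the system compares quickly.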
