Eustace06Project_presentation

Transcript Eustace06Project_presentation

Classification of melody by composer using hidden
Markov models
Greg Eustace
MUMT 614: Music Information Acquisition,
Preservation, and Retrieval
Overview
•
•
•
•
•
•
•
Project description
Software and dataset
Representations of melody
HMM parameters
The learning process
Training and testing HMMs
Summary
Project description
•
The goal of this project is to use Hidden Markov Models (HMM) for
the automatic classification of symbolic melodic data by composer.
•
Research questions:
•
•
•
Are there significant statistical differences between melodies written by
different composers?
How do different representations of melody affect the performance of
the classifier?
How do different types of HMMs affect the performance?
Representations of melody
•
C major scale:
1. Absolute pitch (normalised to the octave).
(e.g. 1, 3, 5, 6, 8, 10, 11, 12)
2. Absolute pitch with rhythm
(e.g. 1, 1, 3, 5, 6, 8, 10, 10, 11, 11, 12, 12)
• The note number is given once if equal to a quarter note, twice
for a half note, etc. (Chai, and Vercoe 2001).
3. Interval
(e.g. 2, 2, 1, 2, 2, 2, 1)
Representations of melody
4. Melodic Contour
(e.g. 2, 2, 2, 2, 2, 2, 2)
• The convention is that intervals of a unison = 1, intervals of second = 2
and all other intervals are 3.
5. Alternative contour representations?
Software and dataset
• Extraction of melodic data uses ad hoc algorithms developed in
MAX/MSP and MATLAB.
• Classification uses the HMM Toolbox by Kevin Murphy.
http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html
• A large collection of links to MIDI files has been compiled by Cory
McKay:
http://www.music.mcgill.ca/~cmckay/midi.html
• The melodic data is the soprano line extracted from a type 1 MIDI
file.
• It is hoped that composers of contrasting style will show the greatest
statistical differences.
Hidden Markov models
• “A Hidden Markov model (HMM) is a doubly imbedded stochastic
process with an underlying stochastic process that is not observable
(i.e. is hidden) but can only be observed through another set of
stochastic processes that produce the sequence of observations”
(Rabiner 1989).
• An HMM is defined by the parameters:
• M = number of distinct observation symbols (e.g. for absolute pitch these
are the numbers 1-13 corresponding to the 12 notes of the octave).
• N = number of states in the model (these may not have physical
significance).
• A = {aij} = state transition probability distribution
• B = {bj(k)} = observation symbol probability distribution
• π = {πi} = the initial state distribution
• The set of hidden model parameters are given by λ = (A, B, π).
The learning process
• The Baum-Welch learning algorithm is used to find the hidden
parameters (λ) of the HMM .
• This process uses maximum likelihood parameter estimation. In
general, the likelihood is maximized when a given test sequence
corresponds to a specific model. It is also common to attempt to
maximize the logarithm of the likelihood.
Training and testing HMMs
• The training procedure corresponds to one of the three fundamental
problems associated with HMMs as defined by Rabiner (1989). That
is, for a given observation sequence O = {O1, O2,… On} and a
model with parameters λ = (A, B, π) what is the value of λ that
maximizes the probability of the observation sequence P(O| λ)?
• The first fundamental problem is essential to the classification of an
unknown sequence. Given an observation sequence O = {O1, O2,…
On} and a model λ = (A, B, π) how do we efficiently compute P(O|λ)
(Rabiner 1989)?
Training and testing HMMs
The process can be summarised as:
• Train a different model on data from each composer.
• Once the models have been trained compute the log-likelihood for a
test sequence for each model.
• Classify the training data according to the model which gives the
highest value for the log-likelihood.
• Repeat the process for all representations of melody
Training and testing HMMs
•
•
Using different types of HMMs may affect classification
Fully connected (ergodic) models: every state is connected to every other
state.
•
Left-right (Bakis) models: states are connected to themselves and to the
adjacent state (proceeding from left to right). These are typically used for
modelling time varying signals.
Summary
• A summary of variables which may affect classification:
•
•
•
•
•
•
•
•
Composer
Size of the dataset
Nature of the pieces used
MIDI transcription
Melodic extraction
Melodic representation
Type of HMM
Number of model states
References
• Chai, W., and B. Vercoe. Folk music classification using hidden
Markov models. 2001. Proceedings of the international conference
on artificial intelligence.
• Rabiner, L. 1989. A tutorial on hidden Markov models and selected
applications in speech recognition. Proceedings of the IEEE 77:
257–86.

Eustace06Project_presentation

Transcript Eustace06Project_presentation

Directory