ISE 575 Paper Review Presentation
Music Generation from Statistical Models
Author : Darrell Conklin
Reviewer : ChangHyun Kim
Date : February 8th 2007
Before Starting the Presentation…
What is a statistical model in music?
It is the use of mathematics or statistics in music analysis and music generation.
Why a statistical model?
With a statistical model, music novices such as engineers (except Prof. Elaine, lab friends, and the students here) and scientists can compose and perform a plausible piece of music.
Then is it only for music novices? What about music professionals? Do they like it too?
???
Darrell Conklin: Music Generation from Statistical Models
Presented in ISE575 by ChangHyun Kim
Index
1. Abstract
2. Introduction
3. Statistical models of music
3.1. Context models
3.2. Complex statistical models
4. Generation of music from statistical models
4.1. Random walk method
4.2. Hidden Markov models and Viterbi decoding
4.3. Stochastic sampling
4.4. Pattern-based sampling
5. Conclusion
1. Abstract
Statistical models create new pieces from extant pieces.
The music generation problem is treated as sampling from a statistical model built from extant pieces.
There is no distinction between analytic and synthetic models of music.
The paper presents several methods for sampling and proposes a new approach that maintains the “intra opus” pattern repetition within an extant piece.
A major component of creativity is the adaptation of extant art works, and this is also an efficient way to sample pieces from complex statistical models.
2. Introduction
There have been many earlier attempts to build robust, creative, style-independent empirical learning methods for musical style imitation.
Historically, linguistics used statistics mainly for language analysis.
Music, however, used statistics for generation from the very beginning.
This causes a subjective evaluation problem.
Consequently, music researchers turned their attention to music analysis, music prediction, phrase structure analysis, and music classification.
In music, analytic and synthetic models do not differ.
For example, in language research the success of speech recognition left only a small, dedicated group of natural language generation researchers.
Analytic statistical models can guide the generation process by evaluating candidate generations and ruling out those with low probability.
Surprisingly, only a few statistical methods have been used.
Introduction to Sections 2 and 3
3. Statistical models of music
A piece of music is a sequence of events: notes with duration, onset time, etc.
A statistical model captures regularities in a class of music, such as a genre, a style, or a composer’s style.
To classify an extant piece according to a model, the conditional probability P(p|mi) is used, where p is a piece and mi is a model.
In a Bayesian classification framework, there are several statistical models mi, one for each class i.
Statistical models of music are created empirically by induction.
A corpus of training pieces in a class is used to instantiate the parameters of a statistical model.
For example, suppose I have a music genre classification model. This model could use the key, chord progression, and beat changes as categorizing parameters.
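The classification idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the two toy corpora, the class names "stepwise" and "leaping", the bigram form of the model, and add-one smoothing are all hypothetical choices, not taken from the paper. Equal class priors are assumed, so the class with the highest P(p|mi) wins.

```python
import math
from collections import Counter

def train_bigram_model(corpus, alphabet):
    """Induce a bigram model from training pieces (add-one smoothing is an assumption)."""
    counts = Counter()
    context_totals = Counter()
    for piece in corpus:
        for prev, cur in zip(piece, piece[1:]):
            counts[(prev, cur)] += 1
            context_totals[prev] += 1
    size = len(alphabet)

    def prob(prev, cur):
        return (counts[(prev, cur)] + 1) / (context_totals[prev] + size)

    return prob

def log_prob(piece, model):
    """log P(p | m): sum of log transition probabilities (the first event is ignored)."""
    return sum(math.log(model(prev, cur)) for prev, cur in zip(piece, piece[1:]))

def classify(piece, models):
    """Return the class whose model assigns the piece the highest probability."""
    return max(models, key=lambda label: log_prob(piece, models[label]))

ALPHABET = ["C", "D", "E", "F", "G"]
# Hypothetical toy corpora, one per class.
stepwise = [list("CDEFGFEDC"), list("EDCDEFGFE")]
leaping = [list("CEGECEGEC"), list("GECEGCEGC")]
models = {
    "stepwise": train_bigram_model(stepwise, ALPHABET),
    "leaping": train_bigram_model(leaping, ALPHABET),
}
print(classify(list("CDEDC"), models))
```

Log probabilities are used only to avoid numerical underflow on longer pieces; the ranking of the classes is unchanged.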
3.1. Context models
The most prevalent type of statistical model
Markov, hidden Markov, n-gram, and finite-state models
Why is the context model prevalent?
Events are predictable
Easy to induce from examples, e.g. with a suffix tree
Very fast -> applicable to real-time algorithmic composition systems
Easy to compute
Straightforward to generate new music
Shortcomings: limited training corpora and few sequences of events recurring “inter opus”
Problem: “Limited training corpora”
From the analytic point of view, new music cannot be classified into one specific class
From the synthetic point of view, few sequences of events will be encountered inter opus.
Sparse data problem
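To make the induction step and the sparse data problem concrete, here is a minimal sketch. The two training pieces and the function names are hypothetical, and a plain dictionary of counts stands in for the suffix tree mentioned on the slide.

```python
from collections import Counter, defaultdict

def induce_context_model(pieces, order=2):
    """Count which events follow each context of `order` preceding events."""
    table = defaultdict(Counter)
    for piece in pieces:
        for i in range(order, len(piece)):
            context = tuple(piece[i - order:i])
            table[context][piece[i]] += 1
    return table

def conditional(table, context):
    """Maximum-likelihood P(event | context); an empty dict if the context was never seen."""
    counts = table.get(tuple(context), {})
    total = sum(counts.values())
    return {event: n / total for event, n in counts.items()}

# Hypothetical toy training corpus.
corpus = [list("CDECDEFG"), list("CDEFGFED")]
table = induce_context_model(corpus, order=2)
print(conditional(table, "CD"))  # a context seen in training
print(conditional(table, "GC"))  # an unseen context: nothing to predict from
```

With longer contexts the number of unseen contexts grows quickly, which is the sparse data problem in miniature.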
3.2. Complex statistical models
Musical events (notes) are derived from background knowledge. This approach is called “viewpoints.”
The viewpoint model is one of the history-based models.
The technique of viewpoints
Any feature computable from preceding events can be used to condition the probability of the current event.
The current event can be predicted by an interpolation of two separate models, called the short-term and long-term models.
What is the relation between these two predictions for the current event? Any ideas?
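One possible answer to the question above is a weighted mixture of the two predictions. The sketch below assumes a fixed linear interpolation weight, which is only one of several ways to combine the models and is not claimed to be the paper's scheme. The long-term model is estimated from a hypothetical corpus, the short-term model from the piece generated so far.

```python
from collections import Counter

def followers(sequence, context_event):
    """Relative frequencies of the events that follow `context_event` in `sequence`."""
    counts = Counter(
        cur for prev, cur in zip(sequence, sequence[1:]) if prev == context_event
    )
    total = sum(counts.values())
    return {event: n / total for event, n in counts.items()}

def interpolate(long_term, short_term, short_weight=0.5):
    """Linear interpolation of two predictions (the fixed weight is an assumption)."""
    if not short_term:
        return dict(long_term)
    if not long_term:
        return dict(short_term)
    events = set(long_term) | set(short_term)
    return {
        e: short_weight * short_term.get(e, 0.0)
        + (1 - short_weight) * long_term.get(e, 0.0)
        for e in events
    }

corpus = list("CDECDECDEF")        # hypothetical training corpus (long-term model)
piece_so_far = list("CDGCDGCD")    # hypothetical current piece (short-term model)
prediction = interpolate(followers(corpus, "D"), followers(piece_so_far, "D"))
print(prediction)
```

Here the corpus always continues D with E, while the current piece has been continuing D with G, so the mixture gives both a chance.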
Continued…
More powerful grammars, such as context-free grammars, can also have statistical models.
A type of context-free grammar, called the dependency grammar, may hold promise for music.
For example, this type of model elegantly captures dependencies between chord and non-chord tones in tonal music.
Despite their great capabilities for complex music, some musical features require even more than context-free grammars.
4. Generation of music from statistical models
Music analysis is the step prior to music generation.
Goal of an analytic statistical model: assign high probabilities to pieces in the class, and lower probabilities to all other pieces.
In the Bayesian framework, given multiple class models, a piece is classified by the model that assigns it the highest probability.
4.1. Random Walk
History-based model -> complex statistical models
Process: generate a random event, add this event to the current piece, and repeat until the duration is exceeded.
Flawed for generating complete pieces, because it is greedy and cannot guarantee that pieces with high overall probability will be produced.
What is the meaning of “greedy”?
Low-probability events (or sequences) are possible because it is based on a complex statistical model.
Figure. Process diagram: generate a random event (number) -> add this event to the current piece -> is the duration exceeded? (No: repeat; Yes: stop)
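The loop in the process diagram can be sketched as follows. The bigram table, the toy corpus, and the use of an event count as the "duration" are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

def induce_bigram_table(pieces):
    """Count the events that follow each single preceding event."""
    table = defaultdict(Counter)
    for piece in pieces:
        for prev, cur in zip(piece, piece[1:]):
            table[prev][cur] += 1
    return table

def random_walk(table, start, max_events, rng):
    """Sample one event at a time from the model until the piece is long enough."""
    piece = [start]
    while len(piece) < max_events:       # "is the duration exceeded?"
        counts = table.get(piece[-1])
        if not counts:                   # dead end: no continuation was ever observed
            break
        events = list(counts)
        weights = [counts[e] for e in events]
        piece.append(rng.choices(events, weights=weights, k=1)[0])
    return piece

# Hypothetical toy corpus.
table = induce_bigram_table([list("CDEFGFEDC"), list("CEGECDEDC")])
print(random_walk(table, "C", 16, random.Random(0)))
```

Each event is drawn only from the local distribution, so nothing in the loop looks at the probability of the piece as a whole. That is the sense in which the method is greedy.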
More about Random Walk
Assume a model:
An individual walks on a straight line and, at each point in time, takes either one step to the right with probability p or one step to the left with probability 1-p.
Starting point: 0 (origin)
The distance between one point and the next is constant.
In the symmetric case (p = 1/2), the direction from one point to the next is chosen at random, and no direction is more probable than another.
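The one-dimensional model above fits in a few lines. This is a minimal sketch; the function name and the choice of 20 steps are arbitrary.

```python
import random

def random_walk_1d(num_steps, p, rng):
    """Walk on a line from the origin: +1 with probability p, -1 with probability 1-p."""
    position = 0
    path = [position]
    for _ in range(num_steps):
        position += 1 if rng.random() < p else -1
        path.append(position)
    return path

print(random_walk_1d(20, 0.5, random.Random(0)))
```

Setting p = 0.5 gives the symmetric walk; any other value makes the walk drift to one side.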
Example
“Myrw2d” function in MATLAB
% Myrw2d(numSteps, plotInterval)
Simulates a two-dimensional symmetric random walk (equal probability of going up, down, left, or right).
numSteps is the number of steps.
The probability distribution (the number of times the same point is visited) is plotted over the interval [-plotInterval, plotInterval].
Simulation
Algorithm
% Each direction of the walk triggers one drum sample.
moveEast = wavread('hihat1.wav');
moveWest = wavread('kick2.wav');
moveNorth = wavread('snare1.wav');
moveSouth = wavread('hihat2.wav');
Rw(100,20)
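The simulation can be approximated without MATLAB or audio playback. The sketch below is a hypothetical Python rendering: it keeps the direction-to-sample mapping from the slide, but it only records which sample each step would trigger and how often each grid point is visited. The step count of 100 mirrors the first argument of the call on the slide, on the assumption that it is the number of steps.

```python
import random
from collections import Counter

# Direction -> (grid step, drum sample named on the slide).
MOVES = {
    "east": ((1, 0), "hihat1.wav"),
    "west": ((-1, 0), "kick2.wav"),
    "north": ((0, 1), "snare1.wav"),
    "south": ((0, -1), "hihat2.wav"),
}

def drum_walk(num_steps, rng):
    """Symmetric 2D random walk; returns the triggered samples and the visit counts."""
    x, y = 0, 0
    visits = Counter({(0, 0): 1})
    track = []
    for _ in range(num_steps):
        direction = rng.choice(list(MOVES))   # all four directions equally likely
        (dx, dy), sample = MOVES[direction]
        x, y = x + dx, y + dy
        visits[(x, y)] += 1
        track.append(sample)
    return track, visits

track, visits = drum_walk(100, random.Random(1))
print(track[:8])
```

The visit counts play the role of the distribution that the MATLAB function plots.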
4.2. Hidden Markov Model and Viterbi Decoding
A statistical model in which observed events are generated from underlying hidden states.
State transitions occur according to probabilities.
Different state sequences have different probabilities.
Decoding step -> Viterbi dynamic programming algorithm
Viterbi decoding produces the most probable underlying sequence of harmonic symbols for a given melody line.
Viterbi decoding drawback: computation time increases exponentially with the context length of the hidden Markov model.
Chorale K11 (Bach's harmonisation), Moray Alan
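A compact sketch of Viterbi decoding for harmonization follows. The two chord states, the five melody notes, and every probability are hypothetical numbers chosen for illustration, not values from the paper or from the chorale example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, state path) of the most probable hidden state sequence."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            prob, path = max(
                ((best[r][0] * trans_p[r][s], best[r][1]) for r in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit_p[s][obs], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

# Hypothetical toy model: hidden chords I and V, observed melody notes.
states = ["I", "V"]
start_p = {"I": 0.8, "V": 0.2}
trans_p = {"I": {"I": 0.6, "V": 0.4}, "V": {"I": 0.7, "V": 0.3}}
emit_p = {
    "I": {"C": 0.4, "E": 0.3, "G": 0.2, "D": 0.05, "B": 0.05},
    "V": {"G": 0.3, "B": 0.3, "D": 0.3, "C": 0.05, "E": 0.05},
}
probability, chords = viterbi(["C", "D", "B", "C"], states, start_p, trans_p, emit_p)
print(chords, probability)
```

With a first-order model the work per melody note is proportional to the square of the number of states. Widening the context multiplies the effective number of states, which is where the exponential cost comes from.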
More about the Hidden Markov Model
Figure. State transitions in a hidden Markov model (example)
X – hidden states
Y – observable outputs
A – transition probabilities
B – output probabilities
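To show how the four symbols fit together, here is a minimal generative sketch that uses the same letters. The states, outputs, and probabilities are hypothetical; only the roles of X, Y, A, and B come from the slide.

```python
import random

def sample_hmm(length, start, A, B, rng):
    """Generate hidden states X and observable outputs Y from an HMM."""
    def draw(dist):
        outcomes = list(dist)
        return rng.choices(outcomes, weights=[dist[o] for o in outcomes], k=1)[0]

    X, Y = [], []
    state = draw(start)
    for _ in range(length):
        X.append(state)              # hidden state at this step
        Y.append(draw(B[state]))     # output emitted according to B
        state = draw(A[state])       # next state chosen according to A
    return X, Y

# Hypothetical toy parameters.
start = {"I": 0.8, "V": 0.2}
A = {"I": {"I": 0.6, "V": 0.4}, "V": {"I": 0.7, "V": 0.3}}
B = {"I": {"C": 0.5, "E": 0.3, "G": 0.2}, "V": {"G": 0.4, "B": 0.3, "D": 0.3}}
X, Y = sample_hmm(8, start, A, B, random.Random(0))
print(X)
print(Y)
```

An observer sees only Y; recovering X from Y is the decoding problem.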
Applications of hidden Markov models
Speech recognition, gesture and body motion recognition, optical character recognition
Machine translation
Bioinformatics and genomics
Prediction of protein-coding regions in genome sequences
Modeling families of related DNA or protein sequences
Prediction of secondary structure elements from protein primary sequences
Musical improvisation systems such as OMax
OMax interaction with a human performer
OMax is based on a statistical model related to Variable Memory Markov Models, a variant of Markov models in which the Markovian order is not fixed but rather "guessed" by the system. It is adaptive, i.e. it changes locally depending on what is learned.
Recordings of a concert at USC in Los Angeles on April 4, 2006, featuring Dennis Thurmond, pianist, and G. Assayag playing OMax.
4.3. Stochastic sampling
Gibbs sampling
Given a model m, we start with some initial piece p.
The iteration then starts:
Generate a random event for the piece.
Substitute this event, if valid, into its position.
This makes a new piece p’, with probability P(p’|m).
Go back to the start of the iteration.
Problem: slow, because of the time needed to calculate P(p’|m)
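The steps above can be sketched as follows, under one common reading of Gibbs sampling: at a random position, every candidate substitution is scored with P(p’|m) and one is drawn in proportion to that score. The scale, the toy transition model that favors small intervals, and the function names are hypothetical.

```python
import math
import random

SCALE = ["A4", "B4", "C#5", "D5", "E5", "F#5", "G#5", "A5"]

def transition_prob(prev, cur):
    """Hypothetical toy model: small melodic intervals are more probable."""
    weights = [1.0 / (1 + abs(i - prev)) for i in range(len(SCALE))]
    return weights[cur] / sum(weights)

def piece_prob(piece):
    """P(p | m) for a piece given as a list of scale indices (first event ignored)."""
    return math.prod(transition_prob(a, b) for a, b in zip(piece, piece[1:]))

def gibbs_step(piece, rng):
    """Resample the event at one random position from its conditional distribution."""
    pos = rng.randrange(len(piece))
    candidates = [piece[:pos] + [event] + piece[pos + 1:] for event in range(len(SCALE))]
    # The slow part: one full P(p'|m) evaluation per candidate event.
    weights = [piece_prob(c) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(0)
piece = [0, 7, 0, 7, 0, 7]          # a deliberately jumpy initial piece
for _ in range(200):
    piece = gibbs_step(piece, rng)
print([SCALE[i] for i in piece])
```

Each step evaluates the whole piece once per candidate event, which illustrates why the slide calls the method slow.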
Metropolis sampling
A random event in the piece p is chosen, and a single event is substituted into that position, producing a new piece p’. The new piece is accepted if P(p’|m) > P(p|m); otherwise it is rejected with a rejection probability that increases with each iteration.
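A minimal sketch of this acceptance rule follows. The linear rejection schedule, the unnormalized stepwise score standing in for P(p|m), and all names are assumptions. The slide only says that the rejection probability grows with the iteration count.

```python
import math
import random

def stepwise_score(piece):
    """Hypothetical unnormalized stand-in for P(p | m): favors small intervals."""
    return math.prod(1.0 / (1 + abs(b - a)) for a, b in zip(piece, piece[1:]))

def metropolis(piece, alphabet, prob, iterations, rng):
    """Substitute one random event per iteration, keeping the result if it is accepted."""
    current = list(piece)
    for t in range(iterations):
        pos = rng.randrange(len(current))
        candidate = current[:pos] + [rng.choice(alphabet)] + current[pos + 1:]
        reject_worse = (t + 1) / iterations     # assumed schedule: rises linearly to 1
        if prob(candidate) > prob(current) or rng.random() >= reject_worse:
            current = candidate
    return current

start = [0, 7, 0, 7, 0, 7]
result = metropolis(start, list(range(8)), stepwise_score, 500, random.Random(0))
print(result, stepwise_score(result))
```

Early iterations accept almost anything, while late iterations accept only improvements, so the search settles as it runs.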
Continued…
Drawback: local valleys, where valid substitutions at all locations cannot improve the probability of the current piece. So, good local solutions might be the enemies of good global solutions. In this sense the statistical model exhibits greediness.
Advantage: modifications to extant pieces in a style are a very efficient way to generate high-probability pieces in that style.
Two challenges
Motif- or phrase-level substitutions are required, which raises the computation time.
Musical cohesion is not guaranteed.
More about Gibbs sampling
An algorithm to generate a sequence of samples from the joint probability distribution of two or more random variables
A special case of the Metropolis-Hastings algorithm
My last homework
Simulation
Objective: A-major tonal music with Gibbs sampling
Gibbs sampling from a bivariate normal distribution
Mapping two variables to control the amplitude and frequency of a pure cosine wave
Two random number generators:
gibbssample(ii,1)=y_1+rho*(gibbssample(ii-1,2)-y_2)+sqrt(1-rho^2)*randn(1,1);
gibbssample(ii,2)=y_2+rho*(gibbssample(ii,1)-y_1)+sqrt(1-rho^2)*randn(1,1);
How are keys mapped to random events?
Random number interval    1ST            2ND
-2.0 ~ -1.5               A4             D
-1.5 ~ -1.0               B              B
-1.0 ~ -0.5               C#             C#
-0.5 ~ 0                  D              A4
0 ~ 0.5                   E              A5
0.5 ~ 1.0                 F#             F#
1.0 ~ 1.5                 G#             G#
1.5 ~ 2.0                 A5             E
Outside                   White Noise    White Noise
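The simulation can be re-created in Python as a rough sketch. The two update lines mirror the MATLAB expressions on the slide, and the key table is the one above. Treating each interval as half-open, and reading the "1ST" and "2ND" columns as the mappings for the first and second variable, are assumptions.

```python
import bisect
import math
import random

EDGES = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
FIRST = ["A4", "B", "C#", "D", "E", "F#", "G#", "A5"]
SECOND = ["D", "B", "C#", "A4", "A5", "F#", "G#", "E"]

def to_key(value, keys):
    """Map a random number to a key; values outside [-2, 2) become white noise."""
    if not (EDGES[0] <= value < EDGES[-1]):
        return "White Noise"
    return keys[bisect.bisect_right(EDGES, value) - 1]

def gibbs_bivariate_normal(n, rho, y_1, y_2, rng):
    """Gibbs sampling from a unit-variance bivariate normal with correlation rho."""
    sd = math.sqrt(1 - rho ** 2)
    x_1, x_2 = y_1, y_2                      # start at the mean
    samples = []
    for _ in range(n):
        x_1 = y_1 + rho * (x_2 - y_2) + sd * rng.gauss(0, 1)
        x_2 = y_2 + rho * (x_1 - y_1) + sd * rng.gauss(0, 1)
        samples.append((x_1, x_2))
    return samples

samples = gibbs_bivariate_normal(16, 0.8, 0.0, 0.0, random.Random(0))
melody = [(to_key(a, FIRST), to_key(b, SECOND)) for a, b in samples]
print(melody)
```

A higher correlation rho ties the two variables more closely together, so the two voices move more alike.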
4.4. Pattern-based sampling
“Intra opus” repetition with a complex statistical model
Shortcoming: while pattern continuation can easily be handled, there is no way to specify at the outset of generation where repeated patterns should begin and end.
Pattern-based sampling sequence: apply a pattern discovery algorithm, then conserve the pattern structure
5. Conclusion
An analytic statistical model categorizes extant pieces into a specific music style according to their probability.
Random walk is a history-based model.
HMMs and Viterbi decoding require deep knowledge.
Stochastic sampling uses extant pieces to reduce the search space and focus more rapidly on high-probability pieces.
A pattern discovery algorithm reveals repetition structure and brings musical cohesion to new productions.
The author says, “A major component of creativity is the adaptation of extant art works, and this is also an efficient way to generate music from complex statistical models.”