MSc Project
Musical Instrument Identification System
MIIS
Xiang LI
ee05m216
Supervisor: Mark Plumbley
Motivation of MIIS
Musical instrument identification plays an important role in
musical signal indexing and database retrieval.
People can search for music by musical instrument
rather than by genre or author.
For instance, a user can issue a query such as ‘find the piano solo parts in
a musical database’.
Introduction
[Figure: musical mixtures of instruments (bass, drum, piano, saxophone) go in; identification results come out]
Structure of MIIS
[Diagram: input mixture X(n) -> DUET algorithm (separation) -> estimated sources s1, s2, s3, ..., sn -> feature extraction -> classification -> classification results]
Functional Components
DUET algorithm:
Separate the input musical mixture into sources
Feature Extraction:
Extract features of each source
Classification:
Apply the classifier to each test source to find the class it belongs to
DUET algorithm
Time-frequency representation:
$\tilde{x}_1$ and $\tilde{x}_2$ are the time-frequency representations of the mixtures $x_1$ and $x_2$, e.g. the
Short-Time Fourier Transform (STFT) or the Modified Discrete Cosine Transform (MDCT).
Mixing parameters computation:
$a_j = \left| \tilde{x}_2 / \tilde{x}_1 \right|$
Time-frequency points are labeled with $a_j$
Mask construction:
The mask $M_j = 1_{\Omega_j}$ is the indicator function of the deciding set $\Omega_j$, which is
obtained by grouping the time-frequency points with the same label
Source estimation
$\tilde{s}_j = M_j \tilde{x}_1$
$\tilde{s}_j$ is the time-frequency representation of one source.
Time-domain conversion
Convert each $\tilde{s}_j$ to $s_j$ in the time domain
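The labeling, masking and source estimation steps above can be sketched in a few lines of Python. This is a simplified illustration, not the project's implementation. It assumes the time-frequency coefficients are already available as flat lists of complex numbers, and that the per-source attenuations are known. Full DUET also uses the relative delay between the mixtures and finds the parameters by clustering. The function name `duet_binary_masks` and the toy values are hypothetical.

```python
def duet_binary_masks(x1, x2, attenuations, eps=1e-12):
    """Label each time-frequency point and apply binary masks.

    x1, x2       -- flat lists of complex time-frequency coefficients
    attenuations -- assumed (hypothetical) attenuation a_j of each source
    Returns (labels, estimates), where estimates[j] is M_j * x1.
    """
    labels = []
    for c1, c2 in zip(x1, x2):
        # Mixing parameter at this point: |x2 / x1|
        ratio = abs(c2) / (abs(c1) + eps)
        # Label the point with the closest a_j
        nearest = min(range(len(attenuations)),
                      key=lambda j: abs(ratio - attenuations[j]))
        labels.append(nearest)
    # Binary mask M_j keeps only the points labeled j
    estimates = [
        [c1 if label == j else 0j for c1, label in zip(x1, labels)]
        for j in range(len(attenuations))
    ]
    return labels, estimates


# Toy example: two sources that never overlap in the time-frequency plane
s1 = [1 + 0j, 0j, 2j, 0j]
s2 = [0j, 3 + 0j, 0j, 1 - 1j]
a = [0.5, 2.0]  # assumed attenuations of source 1 and source 2
x1 = [u + v for u, v in zip(s1, s2)]
x2 = [a[0] * u + a[1] * v for u, v in zip(s1, s2)]
labels, estimates = duet_binary_masks(x1, x2, a)
```

In this toy case the sources are disjoint in the time-frequency plane, so the masks recover them from the first mixture. With real instruments the sources overlap and the recovery is only approximate. The forward and inverse transforms are omitted here.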
Feature Extraction
Mel-Frequency Cepstral Coefficient (MFCC)
Relationship between Mel and Hertz: $\mathrm{Mel}(f) = 2595 \log_{10}(1 + f/700)$
Spectral Rolloff
It is calculated by summing up the power spectrum samples until the desired
percentage (threshold) of the total energy is reached.
Bandwidth
Defined as the width of the range of frequencies that the signal occupies.
Root Mean Square
RMS features are used to detect boundaries among musical instruments
Spectral Centroid
Correlates strongly with the subjective qualities of “brightness” or
“sharpness”.
Zero Crossing Rate
A simple measure of the frequency content of a signal
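Several of the features above are simple enough to sketch directly. The following is a minimal illustration and not the project's code. It assumes a frame of time-domain samples `x` and a power spectrum given as parallel lists `freqs` and `power`. The rolloff threshold of 0.85 is an assumed value, since the slides do not state which percentage was used. The centroid here is weighted by power, which is also an assumption. MFCC and bandwidth are left out.

```python
import math


def hz_to_mel(f_hz):
    """Mel(f) = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def rms(x):
    """Root mean square of a frame of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)


def spectral_centroid(freqs, power):
    """Power-weighted mean frequency (assumed weighting)."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)


def spectral_rolloff(freqs, power, threshold=0.85):
    """Frequency at which the running sum of the power spectrum
    first reaches `threshold` of the total energy."""
    target = threshold * sum(power)
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= target:
            return f
    return freqs[-1]
```

For example, `hz_to_mel(1000)` is close to 1000 mel, which is how the scale is anchored.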
Classification
K-Nearest Neighbor
Nonparametric classifier
Large storage required
[Figure: samples of Class a, Class b and Class c plotted in the x-y feature plane, with a test point X to be classified]
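A k-nearest-neighbor classifier can be sketched as follows. This is an illustration under assumptions, not the project's code: the slides do not give the value of k or the distance measure, so k = 3 and Euclidean distance are assumed here, and the training points are made up. The classifier stores the whole training set, which is why large storage is required.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    train -- list of (feature_vector, label) pairs, all kept in memory
    query -- feature vector of the source to classify
    k     -- number of neighbors (assumed value; not given in the slides)
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up two-dimensional training data for two classes
train = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
near_a = knn_classify(train, (0.2, 0.1))
near_b = knn_classify(train, (5.5, 5.2))
```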
Experiments
Musical Instruments Database
Database: downloaded from the University of Iowa website.
Mixtures are composed of isolated notes.
Training set: includes 18 classes of musical instruments
Testing set: 3 to 5 instruments are chosen to generate each mixture
The instruments to be tested:
Alto Saxophone
Bassoon
Double Bass
Flute
Viola
Experiments with three groups
For each group, five mixtures were tested. The results for each group are
as follows:
Group     No. of sources   Percentage correct
Group 1   3                80%
Group 2   4                60%
Group 3   5                48%
Example
[Figure: original sources and estimated sources]
Source               SDR       Original Source   Estimated Source   Result
AltoSaxophone.C4B4   17.4453   2                 2                  correct
Bassoon.C3B3         10.4249   9                 9                  correct
Double Bass.D2B2     6.0127    4                 4                  correct
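The slides do not say how the SDR values were computed. As a rough guide to reading them, one common simplified definition of signal-to-distortion ratio is the energy of the original source divided by the energy of the estimation error, expressed in decibels. The sketch below assumes that definition and made-up signals. The project may have used a different measure.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    This definition is an assumption; the slides do not state one.
    """
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal / error)


# Made-up example: the estimate is the original scaled by 0.9
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
example_sdr = sdr_db(reference, estimate)
```

Under this definition a higher value means the estimated source is closer to the original.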
Results discussion
Without MIIS, the probability of identifying a source correctly
among 18 classes is 1/18, which is about 5.6%.
The worst case in our experiments is group 3, where
each mixture consists of five sources. The percentage correct is
48%.
The fewer sources a mixture has, the higher the percentage
correct. More sources introduce more
interference among each other.
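The comparison with chance can be worked through numerically. The sketch below assumes that "percentage correct" is counted over all sources in a group, i.e. five mixtures times the number of sources per mixture. The slides do not state this, so the implied counts are only a consistency check, not reported results.

```python
# Chance level when picking one of 18 classes at random
chance = 1 / 18

# (sources per mixture, reported percentage correct) for each group
groups = {"Group 1": (3, 0.80), "Group 2": (4, 0.60), "Group 3": (5, 0.48)}
mixtures_per_group = 5

implied = {}
for name, (n_sources, pct) in groups.items():
    total_sources = mixtures_per_group * n_sources
    # Assumption: the percentage is taken over all sources in the group
    implied[name] = (round(pct * total_sources), total_sources)

for name, (correct, total) in implied.items():
    print(f"{name}: {correct}/{total} sources, "
          f"{groups[name][1] / chance:.1f}x chance")
```

Under that assumption each reported percentage corresponds to a whole number of sources, and even the five-source group is well above the chance level.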
Conclusion
MIIS is a system that identifies each musical instrument in a
musical mixture.
Three functional components are introduced:
DUET algorithm
Feature Extraction
Classification
Three groups of experiments, fifteen mixtures in total, have
been tested. The percentages correct are 80%, 60% and 48%
respectively.
More features could be extracted, such as MPEG-7 features
A more adaptive mask could help overcome interference among
sources.