
MSc Project
Musical Instrument Identification System
MIIS
Xiang LI
ee05m216
Supervisor: Mark Plumbley
Motivation of MIIS
• Musical instrument identification plays an important role in musical signal indexing and database retrieval.
• It lets people search for music by the instruments it contains, rather than by the type of music or the author.
• For instance, a user can query "find the piano solo parts in a musical database".
Introduction
[Diagram: a musical mixture containing several instruments (bass, drum, piano, saxophone) is fed into the system, which outputs the identification results.]
Structure of MIIS
[Block diagram: input mixture X(n) → DUET algorithm (separation) → estimated sources s1, s2, s3, …, sn → feature extraction → classification → classification results.]
Functional Components
• DUET algorithm: separates the input musical mixture into its sources.
• Feature extraction: extracts features from each estimated source.
• Classification: applies a classifier to each test source to determine which instrument class it belongs to.
DUET algorithm
• Time-frequency representation:
$\tilde{x}_1$ and $\tilde{x}_2$ are the representations of the two mixture channels in the time-frequency domain, e.g. the Short-Time Fourier Transform or the Modified Discrete Cosine Transform.
• Mixing parameter computation:
$a_j \approx \left|\dfrac{\tilde{x}_2(\tau,\omega)}{\tilde{x}_1(\tau,\omega)}\right|$
Time-frequency points are labeled with $a_j$.
• Mask construction:
$M_j = \mathbf{1}_{\Omega_j}$, where the deciding set $\Omega_j$ is obtained by grouping the time-frequency points that carry the same label.
• Source estimation:
$\hat{s}_j = M_j\,\tilde{x}_1$ is the time-frequency representation of one source.
• Time-domain conversion:
Convert each $\hat{s}_j$ to $s_j$ in the time domain (a code sketch of these steps follows below).
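To make the steps concrete, here is a minimal Python sketch of a DUET-style separation; it is illustrative only, not the project's actual code. It assumes a two-channel mixture and uses only the amplitude-ratio label, clustering the ratios with k-means instead of DUET's histogram peak picking; the function name duet_separate and its parameters are assumptions.

```python
# Minimal DUET-style separation sketch (illustrative, not the project's code).
# Assumes a two-channel mixture (x1, x2) and uses only the amplitude-ratio
# label a = |X2/X1|; the full DUET algorithm also uses the relative delay.
import numpy as np
from scipy.signal import stft, istft
from scipy.cluster.vq import kmeans2

def duet_separate(x1, x2, n_sources, fs=44100, nperseg=1024):
    # 1. Time-frequency representation of both mixture channels (STFT)
    _, _, X1 = stft(x1, fs=fs, nperseg=nperseg)
    _, _, X2 = stft(x2, fs=fs, nperseg=nperseg)

    # 2. Mixing-parameter computation: amplitude ratio at every TF point
    a = np.abs(X2) / (np.abs(X1) + 1e-12)

    # 3. Label TF points by clustering the ratios into n_sources groups
    #    (a simplification of DUET's histogram peak picking)
    _, labels = kmeans2(a.reshape(-1, 1), n_sources, minit='points')
    labels = labels.reshape(a.shape)

    # 4./5. Mask construction, source estimation, time-domain conversion
    sources = []
    for j in range(n_sources):
        M_j = (labels == j).astype(float)              # binary mask, 1 on Omega_j
        _, s_j = istft(M_j * X1, fs=fs, nperseg=nperseg)
        sources.append(s_j)
    return sources
```

Clustering on the amplitude ratio alone is enough to illustrate the masking idea; the full DUET algorithm builds a two-dimensional attenuation/delay histogram and picks one peak per source.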
Feature Extraction
• Mel-Frequency Cepstral Coefficients (MFCC)
Relationship between the Mel and Hertz scales: $\mathrm{Mel}(f) = 2595\,\log_{10}\!\left(1 + \dfrac{f}{700}\right)$
• Spectral Rolloff
Calculated by summing the power-spectrum samples until the desired percentage (threshold) of the total energy is reached.
• Bandwidth
Defined as the width of the range of frequencies that the signal occupies.
• Root Mean Square (RMS)
RMS features are used to detect boundaries between musical instruments.
• Spectral Centroid
Correlates strongly with the subjective qualities of "brightness" or "sharpness".
• Zero Crossing Rate
A simple measure of the frequency content of a signal (a sketch of these features follows below).
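As an illustration only, the NumPy sketch below computes several of the listed frame-level features directly from their textbook definitions (the MFCC step itself is omitted); the function names, the 85% rolloff threshold and the spread-around-the-centroid bandwidth definition are assumptions, not taken from the slides.

```python
# Illustrative NumPy sketch of some of the frame-level features listed above
# (assumed names and an 85% rolloff threshold; not the project's code).
import numpy as np

def mel(f):
    # Mel(f) = 2595 * log10(1 + f / 700)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def frame_features(frame, fs, rolloff_threshold=0.85):
    spectrum = np.abs(np.fft.rfft(frame)) ** 2             # power spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)

    # Spectral centroid: power-weighted mean frequency ("brightness")
    centroid = np.sum(freqs * spectrum) / np.sum(spectrum)

    # Spectral rolloff: frequency below which rolloff_threshold of the
    # total spectral energy is contained
    cumulative = np.cumsum(spectrum)
    rolloff = freqs[np.searchsorted(cumulative, rolloff_threshold * cumulative[-1])]

    # Bandwidth: here taken as the power-weighted spread around the centroid
    bandwidth = np.sqrt(np.sum((freqs - centroid) ** 2 * spectrum) / np.sum(spectrum))

    # Root mean square energy of the frame
    rms = np.sqrt(np.mean(frame ** 2))

    # Zero-crossing rate: sign changes per sample
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0

    return {"centroid": centroid, "rolloff": rolloff,
            "bandwidth": bandwidth, "rms": rms, "zcr": zcr}
```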
Classification
• K-Nearest Neighbor
  – Nonparametric classifier
  – Large storage required
[Figure: two-dimensional feature space (axes x, y) with classes a, b and c; a test point X is assigned the class of its nearest neighbours. A code sketch follows below.]
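For illustration, a minimal k-nearest-neighbour classifier over the extracted feature vectors might look like the sketch below; the Euclidean distance, k = 5 and all names are assumptions rather than the project's actual choices.

```python
# Illustrative k-nearest-neighbour classifier (assumed names, Euclidean
# distance, k = 5); train_X is an (N, d) array of feature vectors and
# train_y the corresponding instrument labels.
import numpy as np
from collections import Counter

def knn_classify(test_x, train_X, train_y, k=5):
    # Nonparametric: every training vector is kept, hence the large storage
    distances = np.linalg.norm(train_X - test_x, axis=1)  # Euclidean distances
    nearest = np.argsort(distances)[:k]                   # indices of k closest points
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]                     # majority-vote class
```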
Experiments
Musical Instruments Database
Database: downloaded from the University of Iowa website.
Mixtures are composed of isolated notes.
Training set: includes 18 classes of musical instruments.
Testing set: 3 to 5 instruments are chosen to generate each mixture.
The instruments to be tested:
Alto Saxophone
Bassoon
Double Bass
Flute
Viola
Experiments of three groups
For each group, five mixtures were tested; the result of each group is listed below:

Group     No. of sources    Percentage correct
Group 1   3                 80%
Group 2   4                 60%
Group 3   5                 48%
Example
[Figures: the original sources and the estimated sources.]

Source               SDR       Original source    Estimated source    Result
AltoSaxophone.C4B4   17.4453   2                  2                   correct
Bassoon.C3B3         10.4249   9                  9                   correct
Double Bass.D2B2      6.0127   4                  4                   correct
Results discussion
• Without MIIS, the chance-level recognition rate for each source among the 18 classes is 1/18, which is about 5.5%.
• The worst case in our experiments is group 3, where each mixture consists of five sources; the correct percentage is 48%.
• The fewer sources a mixture contains, the better the system performs: more sources introduce more interference with one another.
Conclusion
• MIIS is a system able to identify each musical instrument in a musical mixture.
• Three functional components were introduced:
  – DUET algorithm
  – Feature extraction
  – Classification
• Three groups of experiments, fifteen mixtures in total, have been tested. The correct percentages are 80%, 60% and 48% respectively.
• More features could be extracted, such as the MPEG-7 features.
• A more adaptive mask could help overcome the interference among sources.