Is it a bass track?

BASS TRACK SELECTION IN MIDI FILES
AND MULTIMODAL
IMPLICATIONS TO MELODY
Octavio Vicente & Jose M. Iñesta
gPRAI
Pattern Recognition and Artificial Intelligence Group
Computer Music Laboratory
PROJECT
Description and Retrieval of Music and Sound Information
Descripción y Recuperación de Información Musical y Sonora
Spain
introduction
CONTEXT DOMAIN:
• Multimedia content management
• Content-based music information retrieval
DATA:
• Symbolic music container files (digital scores)
• Multi-track MIDI files
• Organized by instruments or parts in tracks
• Some tracks carry particularly useful information
introduction
WHAT KIND OF USEFUL INFORMATION?
• MELODY TRACK
– Melody is what we usually remember of a song
– Music repository indexing (music thumbnails)
– Fingerprinting
– Music similarity and retrieval
• BASS TRACK
– Bass is an important feature in music structure
– Harmonic analysis
– Rhythm analysis
introduction
THE PROBLEM:
• Bass track selection in multi-track MIDI files
using our background in melody track selection
(D. Rizo et al. “A Pattern Recognition Approach for Melody Track Selection in MIDI Files”. ISMIR 2006)
introduction
WHAT CAN WE EXPECT A PRIORI?
• MELODY TRACK
– The concept of melody is somewhat elusive:
• Something singable
• Something catchy in a song
• A monophonic part easy to remember
• BASS TRACK
– Seems easier at first:
• Low pitches involved
• Monophonic, melodic, etc.
• No a priori assumptions about instrumentation
track description
• Statistical features are extracted from every track
• A feature vector represents each track
• Most descriptors include normalized versions
Table 1: Normalized and non-normalized features used for track content description.
About the track
  Track information
    Normalized: Avg Polyphony, Duration, Occupation, Occupation Rate, Number of Notes
    Non-normalized: Avg Polyphony, Occupation Rate
About its content
  Pitch
    Normalized: Highest, Lowest, Mean, Standard Deviation
    Non-normalized: Highest, Lowest, Mean, Standard Deviation
  Pitch intervals
    Normalized: Largest, Smallest, Mean, Standard Deviation
    Non-normalized: Largest, Smallest, Mean, Standard Deviation
  Note durations
    Normalized: Longest, Shortest, Mean, Standard Deviation
    Non-normalized: Longest, Shortest, Mean, Standard Deviation
• For some descriptors it makes no sense to use non-normalized versions
• Normalization uses the max and min values over all the tracks in the file
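The normalization over all the tracks in a file can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (pitch_descriptors, normalize_across_tracks), the use of the population standard deviation, and the toy pitch lists are assumptions made for the example.

```python
from statistics import mean, pstdev

def pitch_descriptors(pitches):
    """Pitch descriptors for one track (MIDI note numbers)."""
    return {
        "highest": max(pitches),
        "lowest": min(pitches),
        "mean": mean(pitches),
        "std": pstdev(pitches),  # assumption: population standard deviation
    }

def normalize_across_tracks(values):
    """Min-max normalize one descriptor using the min and max
    taken over all the tracks of the same file."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All tracks share the same value: nothing to discriminate.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical three-track file, each track given as a list of pitches.
tracks = {
    "piano": [60, 64, 67, 72],
    "bass": [36, 38, 41, 43],
    "flute": [72, 76, 79, 84],
}
lowest = [pitch_descriptors(p)["lowest"] for p in tracks.values()]
normalized_lowest = normalize_across_tracks(lowest)
```

In this toy file the bass track gets a normalized lowest pitch of 0 and the flute gets 1, so the descriptor expresses where a track sits relative to the other tracks of the same file rather than an absolute pitch.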
track description
• Some descriptors prove to be useful:
– Average polyphony
– Lowest pitch
– Normalized no. of notes
[Figure: one panel per descriptor, with the classes No / Yes for the question "Is it a bass track?"]
The combination of these hints will permit us to assign each track a probability of being a bass track.
track description
• The tool for giving the probability is a Random Forest Classifier (RFC)
• chosen for its ability to perform its own feature selection
– Using K trees, each tree T_j gives its decision d_j on track t
– then [formula not recovered]
– where [symbol not recovered] is the ("purity") ratio between the number of samples of the winning class for the decision leaf [rest not recovered]
(Breiman, L. (2001). “Random forests”. Machine learning, 45(1): 5–32 )
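One way K tree decisions and leaf purities might be combined into a probability is sketched below. The slide's own formula did not survive in this transcript, so the combination rule here is an assumption: each tree contributes its leaf purity when it votes "bass" and one minus its purity otherwise, and the contributions are averaged over the K trees. The function name bass_probability and the example values are hypothetical.

```python
def bass_probability(decisions, purities):
    """Combine per-tree decisions into a probability of being a bass track.

    decisions: list of booleans, True if tree T_j decides "bass".
    purities:  list of floats in [0, 1], the fraction of training samples
               of the winning class in the leaf reached in tree T_j.

    Assumed rule (not taken from the slides): average, over the K trees,
    the purity when the tree votes "bass" and 1 - purity otherwise.
    """
    if not decisions or len(decisions) != len(purities):
        raise ValueError("need one purity per tree decision")
    total = 0.0
    for is_bass, purity in zip(decisions, purities):
        total += purity if is_bass else 1.0 - purity
    return total / len(decisions)

# Four hypothetical trees: three vote "bass", one votes "non-bass".
p = bass_probability([True, True, False, True], [0.9, 0.8, 0.7, 1.0])
```

With this rule, a confident "non-bass" vote lowers the probability more than an uncertain one, which is the motivation for weighting votes by purity rather than counting them.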
experimental setup
• Three data sets used (200 files each):
– CL200: classical music
– JZ200: jazz
– KR200: pop-rock (karaoke)
• Number of bass tracks per file:
– None: the system should say NO-TRACK
– One: the system should select it
– Several: the system should select any of them
• Number of bass and non-bass tracks in the MIDI datasets:
experimental setup
[Diagram labels: MIDI files; Track labels; Tag compilation and selection; Bass tags dictionary; Bass labels; Bass tags; Bass tracks; Genre tags (Classical, Jazz, Pop-rock); RFCs: Classifier Classical, Classifier Jazz, Classifier Pop-rock]
Experiment 1
Bass versus non-bass classification:
given a particular track, is it a bass one?
Experiment 2
Bass track selection:
given a file, which track contains the bass part?
[Figure: possible answers for a file: None, 1, 2, 3, 4, …, N]
Notation:
0
For solving the no bass track situation:
Experiment 2
Bass track selection:
given a file, which track contains the bass part?
In addition to accuracy, other evaluations are computed:
• FP : the classifier selects a non-bass track
• TP : the selected track contains the correct bass line
• FN : no track selected but the MIDI file indeed contains at least one
bass track
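The selection step and the FP / TP / FN outcomes can be sketched as follows. This is an illustration under stated assumptions: the probability threshold used to answer NO-TRACK, its default value of 0.5, the function names, the 0-based track indices, and the extra "TN" label for a correct NO-TRACK answer are not taken from the slides.

```python
NO_TRACK = None  # answer given when no track is selected

def select_bass_track(probabilities, threshold=0.5):
    """Pick the track with the highest bass probability, or NO_TRACK
    when even the best candidate falls below the (assumed) threshold."""
    if not probabilities:
        return NO_TRACK
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best if probabilities[best] >= threshold else NO_TRACK

def outcome(selected, bass_tracks):
    """Label one file's result given the set of true bass track indices."""
    if selected is NO_TRACK:
        # FN: nothing selected although the file contains a bass track.
        return "FN" if bass_tracks else "TN"
    # TP: the selected track is a bass track; FP: it is not.
    return "TP" if selected in bass_tracks else "FP"

# Hypothetical file with three tracks, where track 1 is the bass.
selected = select_bass_track([0.1, 0.8, 0.3])
result = outcome(selected, {1})
```

Because any of several bass tracks is an acceptable answer, the ground truth is kept as a set of indices rather than a single index.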
Experiment 2’
Bass track selection across styles:
style specificities of the bass part
The test style files were not used for training
Piano left hand issue
(82.8%)
Experiment 3
A question of multimodal nature arises:
Can we use the bass track information for improving
melody track detection?
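• Melody track classification is based on the corresponding probability, estimated from the output provided by the random forest using the melody-tagged data [formula not recovered]
Constraint: the melody track and the bass track must be different tracks
A first naïve approach could be to first estimate the bass track and remove that track from melody selection.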
• PRO: it simplifies the problem (fewer tracks)
• CON: no new information is provided
Experiment 3
The proposed approach:
instead of looking only at the probability of being a melody, let’s consider the probabilities of being a melody conditioned also by the knowledge of what a bass looks like and the different-track constraint
If we also assume that bass and melody tracks are not mutually conditioned, we arrive at [equations not recovered]
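The different-track constraint combined with the independence assumption can be sketched as a joint selection. The slide's equations are not available in this transcript, so the rule below is an assumption: choose the pair of distinct tracks that maximizes the product of the melody probability of one track and the bass probability of the other. The function name and the probability values are hypothetical.

```python
from itertools import permutations

def select_melody_and_bass(p_melody, p_bass):
    """Return (melody_index, bass_index) with melody_index != bass_index,
    maximizing p_melody[m] * p_bass[b].

    Assumes the two roles are independent given the track descriptions,
    so the joint score is a plain product (an assumption of this sketch).
    """
    if len(p_melody) != len(p_bass) or len(p_melody) < 2:
        raise ValueError("need matching lists with at least two tracks")
    # permutations(..., 2) yields every ordered pair of distinct indices.
    pairs = permutations(range(len(p_melody)), 2)
    return max(pairs, key=lambda mb: p_melody[mb[0]] * p_bass[mb[1]])

# Track 0 has the top probability for both roles when judged separately.
p_melody = [0.6, 0.5, 0.1]
p_bass = [0.7, 0.2, 0.3]
melody, bass = select_melody_and_bass(p_melody, p_bass)
```

In this example, selecting each role separately would assign track 0 to both melody and bass. The joint rule instead picks track 1 as the melody and track 0 as the bass, which shows how bass information can change the melody decision.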
Experiment 3
Results:
Multimodal bass track selection:
Multimodal melody track selection:
Conclusions
• Global statistical features and RFC have proven to be
useful for kinds of tracks other than melody.
• In fact, it works better (+24.4 %) for bass than for
melody (seems to be easier).
• Bass track characterization depends on the music genre.
• Using bass information significantly improved
melody track selection.
• The improvement was lower when melody was used to
select bass tracks.
and future work
• Generalization studies are needed
– conditioned by the long and tedious work of tagging and
checking the ground truth in hundreds of MIDIs
• Natural extension to other tracks:
– instrument-based: piano, for example, or any other;
– role-based: solos, intros, etc.
– Study *multi*modal interactions among them