Predicting Popular Music using data mining and machine learning
MACHINE LEARNING TECHNIQUES FOR MUSIC PREDICTION
S. Grant Lowe
Advisor: Prof. Nick Webb
RESEARCH QUESTIONS
Can we predict the year in which a song was released?
Can we predict the genre of a song?
Can we identify which attributes are the strongest in answering these questions?
BACKGROUND
Hit Song Science
Genre Classification
Year Prediction
APPROACH
Use WEKA
Use the Million Song Dataset
WEKA
Machine Learning Software
Contains visualization tools and algorithms for data analysis and modeling
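WEKA's native input is the ARFF text format, in which each attribute is declared before the data rows. The sketch below shows one way song rows could be written out in that format with a nominal class attribute. The helper name `to_arff`, the attribute names, and the sample row are made up for illustration and are not taken from the project.

```python
def to_arff(relation, numeric_attrs, class_name, class_values, rows):
    """Render rows as ARFF text: numeric attributes followed by one nominal class."""
    lines = [f"@relation {relation}", ""]
    for name in numeric_attrs:
        lines.append(f"@attribute {name} numeric")
    # A nominal attribute lists its allowed values inside braces.
    lines.append(f"@attribute {class_name} {{{','.join(class_values)}}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical example row: loudness (dB), tempo (BPM), decade label.
example = to_arff(
    "songs",
    ["loudness", "tempo"],
    "decade",
    ["1960s", "1970s"],
    [(-9.5, 120.0, "1970s")],
)
print(example)
```

Declaring the class as nominal rather than numeric is what lets WEKA treat the task as classification.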
DATA
Million Song Dataset (MSD): commercial tracks from 1922-2011, collected by LabROSA
EARLY CHALLENGES
Data in the wrong format: HDF5 vs. CSV
Lots of missing data!
Almost half of the songs are missing year, a very important attribute
Many attributes are being ignored because a majority of the songs are missing data.
ArtistID -> Year?
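One reading of "ArtistID -> Year?" is that a missing release year might be estimated from the same artist's other songs. The sketch below measures how many songs lack a year and fills the gaps with the median year of the artist's dated songs. It assumes a missing year is stored as 0 and that each song is a dictionary with `artist_id` and `year` keys; both are assumptions made for illustration, and the sample songs are invented.

```python
from collections import defaultdict
from statistics import median

MISSING_YEAR = 0  # assumption: a year of 0 marks a missing release year


def missing_fraction(songs):
    """Fraction of songs whose release year is missing."""
    return sum(1 for s in songs if s["year"] == MISSING_YEAR) / len(songs)


def impute_year_by_artist(songs):
    """Fill missing years with the median year of the same artist's dated songs."""
    years_by_artist = defaultdict(list)
    for s in songs:
        if s["year"] != MISSING_YEAR:
            years_by_artist[s["artist_id"]].append(s["year"])

    filled = []
    for s in songs:
        year = s["year"]
        known = years_by_artist.get(s["artist_id"])
        if year == MISSING_YEAR and known:
            year = int(median(known))
        # Songs by artists with no dated songs keep the missing marker.
        filled.append({**s, "year": year})
    return filled


# Invented sample data.
songs = [
    {"artist_id": "A", "year": 1975},
    {"artist_id": "A", "year": 1979},
    {"artist_id": "A", "year": 0},
    {"artist_id": "B", "year": 0},
]
print(missing_fraction(songs))
print([s["year"] for s in impute_year_by_artist(songs)])
```

An artist-based estimate is only as good as the artist's career is short; for long-running artists the median can be decades away from the true release year.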
ATTRIBUTES
The MSD contains 53 descriptive attributes for each song, along with 90
timbre attributes. Attributes were removed if they were not good
indicators of release year or genre, or if they were too closely tied to
what was being classified.
ATTRIBUTE MOTIVATION
Ranked Descriptive Attributes
• Loudness (measured in decibels)
• Duration (in seconds)
• Tempo (estimated tempo in BPM)
• Time Signature (estimated beats per bar)
• Key
• Mode (major or minor)
Timbre is the quality of a musical note or sound that distinguishes different types of musical instruments or voices. It is a complex notion, also referred to as sound color, texture, or tone quality, and is derived from the shape of a segment's spectro-temporal surface, independently of pitch and loudness.
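The slides do not say how the 90 timbre attributes were built. One construction that yields exactly 90 values from the 12-dimensional timbre vector the MSD stores per segment is 12 per-dimension means plus the 78 distinct covariances between dimensions (12 × 13 / 2 = 78). The sketch below illustrates that construction under this assumption; the function name `timbre_features` and the input data are hypothetical.

```python
def timbre_features(segments):
    """Summarize per-segment timbre vectors as means plus upper-triangle covariances.

    segments: list of equal-length timbre vectors, one per segment.
    Returns dims means followed by dims * (dims + 1) / 2 covariances.
    """
    n = len(segments)
    dims = len(segments[0])
    means = [sum(seg[i] for seg in segments) / n for i in range(dims)]

    covariances = []
    for i in range(dims):
        for j in range(i, dims):  # upper triangle, diagonal included
            total = sum(
                (seg[i] - means[i]) * (seg[j] - means[j]) for seg in segments
            )
            covariances.append(total / n)  # population covariance
    return means + covariances


# Invented input: 5 segments, 12 timbre values each.
segments = [[float(i + k) for i in range(12)] for k in range(5)]
features = timbre_features(segments)
print(len(features))
```

Summarizing this way gives every song a fixed-length feature vector regardless of how many segments it has, which is what a tabular tool such as WEKA needs.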
EARLY RESULTS – DESCRIPTIVE ATTRIBUTES
Release year discretized into 6 decades: 1960-1970, 1970-1980, etc.
Baseline (chance selection): 16.67%
First tests: 6-9% correctly classified
More recent tests: 25-30%
Why Random Forest and BayesNet?
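The decade binning and the chance baseline above can be sketched as follows. The slide's ranges share their endpoints (1970 appears in two bins), so this sketch assumes half-open bins in which 1970 belongs to the 1970s. The function names and the label format are illustrative choices, not the project's.

```python
def decade_label(year, first=1960, n_bins=6):
    """Map a release year to a decade label such as '1970s', or None if out of range."""
    index = (year - first) // 10
    if not 0 <= index < n_bins:
        return None
    return f"{first + 10 * index}s"


def chance_baseline(n_classes):
    """Expected accuracy when guessing uniformly at random among n_classes."""
    return 1.0 / n_classes


for year in (1959, 1960, 1970, 2011):
    print(year, decade_label(year))
print(round(100 * chance_baseline(6), 2))
```

The 16.67% figure is the uniform-guess baseline, 1/6. If the decades are unevenly populated, always predicting the most common decade would score higher than that, so it can serve as a stricter reference point.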
EARLY RESULTS
TIMBRE RESULTS
GENRE PREDICTION
Genres:
Classic pop and rock
Classical
Dance and Electronica
Folk
Hip-Hop
Jazz
Metal
Pop
Rock and Indie
Soul and Reggae
GENRE PREDICTION RESULTS
CONCLUSIONS & FUTURE WORK
Timbre Attributes are better than Descriptive Attributes – Why?
Taste Profile
Lyrical/Emotional Content
Tag Dataset
QUESTIONS?