A Content-Based System for Music Recommendation
and Visualization of User Preferences Working on
Semantic Notions
Dmitry Bogdanov, Martin Haro, Ferdinand Fuhrmann, Anna Xambo, Emilia Gomez, Perfecto Herrera
Music Technology Group Universitat Pompeu Fabra Roc Boronat, 138, 08018 Barcelona, Spain
{name.surname}@upf.edu
Presented By: Thay Setha
Content-Based Multimedia Indexing (CBMI), 2011 9th International Workshop
Outline
1. Introduction
2. Last.fm & SoundCloud
3. System Architecture
4. Implementation
5. Conclusion
Introduction (1/2)
• The amount of music available in Internet stores is growing rapidly and already exceeds 10 million tracks.
• This requires effective means for browsing and searching music collections.
• This paper presents a content-based system for music recommendation and visualization of user preferences.
Introduction (2/2)
• The system exploits content-based information extracted from the audio signal.
– High-level semantic descriptors of music are inferred from low-level timbral, temporal, and tonal features.
• The system obtains a preference set of music from the user's profile on popular online music services, Last.fm and SoundCloud.
• High-level semantic descriptors, including genres, musical culture, moods, instrumentation types, rhythm, and tempo information, are extracted from the user's preference set and then exploited to recommend similar music and to visualize musical preferences.
Last.fm & SoundCloud (1/2)
• Last.fm
– Established music recommender with an extensive number of users and large playable music collection.
– Provides means for both monitoring listening statistics and social tagging.
– Last.fm API
– http://www.last.fm/home
Last.fm & SoundCloud (2/2)
• SoundCloud
– Platform which allows users (mostly musicians) to collaborate, promote, and distribute their music.
– Allows users to upload their own tracks or mark tracks as their favorite.
– SoundCloud API
– http://soundcloud.com/
System Architecture (1/11)
• There are four parts of the system for music recommendation and visualization:
– Data gathering
– Audio analysis
– Music recommendation
– Preference visualization
System Architecture (2/11)
System Architecture (3/11)
• Data gathering
i. The user specifies his/her account name on both Last.fm & SoundCloud.
ii. The system uses the Last.fm API & SoundCloud API to retrieve the tracks.
System Architecture (4/11)
• Different types of tracks can be used to infer the user’s preference set:
– Tracks marked as favorites by the user on Last.fm
– Tracks listened to most by the user according to their Last.fm statistics
– Tracks marked as favorites by the user on SoundCloud
– Tracks uploaded by the user on SoundCloud
• Result: URLs of the tracks to be included in the preference set, retrieved using the Last.fm and SoundCloud APIs.
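The data-gathering step above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the presented system's code: it only builds request URLs for the public Last.fm web API methods `user.getLovedTracks` and `user.getTopTracks` (no network call is made) and merges track lists into a duplicate-free preference set. The helper names, the placeholder user name and API key, and the example track URLs are hypothetical; SoundCloud requests are left out.

```python
from urllib.parse import urlencode

LASTFM_ROOT = "http://ws.audioscrobbler.com/2.0/"  # Last.fm web API root


def lastfm_request_url(method, user, api_key, limit=50):
    """Build a Last.fm API request URL (nothing is sent over the network)."""
    params = {
        "method": method,
        "user": user,
        "api_key": api_key,
        "format": "json",
        "limit": limit,
    }
    return LASTFM_ROOT + "?" + urlencode(params)


def build_preference_set(*track_lists):
    """Merge track URL lists from several sources, dropping duplicates
    while keeping the order in which tracks were first seen."""
    seen = set()
    merged = []
    for tracks in track_lists:
        for url in tracks:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Placeholder account name and key (assumptions for illustration only).
loved_url = lastfm_request_url("user.getLovedTracks", "some_user", "YOUR_API_KEY")
top_url = lastfm_request_url("user.getTopTracks", "some_user", "YOUR_API_KEY")

# Hypothetical track URLs standing in for the four kinds of API results.
lastfm_favorites = ["http://example.org/t1", "http://example.org/t2"]
lastfm_most_listened = ["http://example.org/t2", "http://example.org/t3"]
soundcloud_favorites = ["http://example.org/t4"]
soundcloud_uploads = ["http://example.org/t1", "http://example.org/t5"]

preference_set = build_preference_set(
    lastfm_favorites, lastfm_most_listened, soundcloud_favorites, soundcloud_uploads
)
print(preference_set)
```

Keeping first-seen order is a design choice of this sketch; the slides do not say how duplicates across the two services are handled.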
System Architecture (5/11)
Audio Analysis
i. Extraction of low-level audio features such as timbral, temporal, tonal, . . .
ii. A number of classification tasks are then run using support vector machines trained on ground truth information about genres, musical culture, moods, instrumentation, rhythm, and tempo.
Technically, using the Canoris API, we can obtain high-level semantic descriptors for each track from the user’s preference set, such as genres, musical culture, moods, instrumentation, rhythm, and tempo.
System Architecture (6/11)
• Canoris offers functionality in both the analysis and synthesis of sound and music.
– Analysis
  · High- and low-level analysis: upload any sound file and retrieve high-level descriptors like genre, mood, . . .
  · Similarity search: put all your files in a collection and see which sounds and songs are similar to each other.
– Synthesis
  · Voice synthesis
  · Voice transform
System Architecture (7/11)
Music Recommendation
i. We employ an in-house music collection of 50,000 music excerpts.
ii. The Canoris API is used to retrieve the same high-level semantic descriptors as used for the preference set.
iii. The system searches for tracks inside the in-house collection with the smallest semantic distance to any of the tracks in the preference set.
iv. Recommendation outcomes are presented to the user, including metadata of tracks, audio previews, and the reason why each track was recommended.
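The search in step iii can be illustrated with a small sketch. It assumes each track is already represented by a vector of class probabilities, and it uses a plain cosine distance as a stand-in for the semantic distance. The function names, track identifiers, and descriptor values are hypothetical and are not taken from the presented system.

```python
import math


def cosine_distance(x, y):
    """1 minus the cosine similarity of two descriptor vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def recommend(preference_set, collection, n=2, distance=cosine_distance):
    """Rank collection tracks by their smallest distance to ANY preference track.

    Returns (track_id, distance, closest_preference_track) tuples; the third
    element can serve as the "reason" for the recommendation.
    """
    ranked = []
    for track_id, descriptors in collection.items():
        best_dist, best_pref = min(
            (distance(descriptors, pref_desc), pref_id)
            for pref_id, pref_desc in preference_set.items()
        )
        ranked.append((track_id, best_dist, best_pref))
    ranked.sort(key=lambda item: item[1])
    return ranked[:n]


# Hypothetical descriptor vectors: [P(rock), P(jazz), P(electronic)].
preference_set = {"pref_a": [0.8, 0.1, 0.1], "pref_b": [0.1, 0.8, 0.1]}
collection = {
    "cand_1": [0.7, 0.2, 0.1],
    "cand_2": [0.1, 0.1, 0.8],
    "cand_3": [0.3, 0.6, 0.1],
}

for track_id, dist, reason in recommend(preference_set, collection, n=2):
    print(f"{track_id}: distance {dist:.3f}, closest to {reason}")
```

A linear scan like this is only meant to show the idea; for a collection of 50,000 excerpts a real system would more likely precompute descriptors and use an indexed search.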
System Architecture (8/11)
• Semantic Distance (Classifier-Based Distance)
– First step:
  · Apply multi-class support vector machines (SVMs) to infer different groups of musical dimensions such as genre and musical culture, moods and instruments, and rhythm and tempo.
  · As an output, each classifier provides probability values for the classes (musical dimensions) on which it was trained.
– Second step:
  · Define a distance operating on the resulting high-level semantic space.
  · We select cosine distance (CLAS-Cos), Pearson correlation distance (CLAS-Pears), and Spearman’s rho correlation distance (CLAS-Spear).
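The three distances named above can be sketched in plain Python. This is a minimal illustration that assumes the common convention "distance = 1 − similarity (or correlation)"; the exact definitions used in the paper may differ. The function names and the example probability vectors are hypothetical.

```python
import math


def cosine_distance(x, y):
    """1 minus the cosine similarity."""
    dot = sum(a * b for a, b in zip(x, y))
    return 1.0 - dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))


def pearson_distance(x, y):
    """1 minus the Pearson correlation (undefined for constant vectors)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return 1.0 - cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_distance(x, y):
    """1 minus Spearman's rho, i.e. the Pearson distance between ranks."""
    return pearson_distance(ranks(x), ranks(y))


# Hypothetical class probabilities concatenated from several classifiers.
song_x = [0.7, 0.2, 0.1, 0.6, 0.4]
song_y = [0.6, 0.3, 0.1, 0.5, 0.5]

print(cosine_distance(song_x, song_y))
print(pearson_distance(song_x, song_y))
print(spearman_distance(song_x, song_y))
```

Under this convention the cosine distance of non-negative probability vectors lies in [0, 1], while the two correlation distances lie in [0, 2].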
System Architecture (9/11)
Fig.1. General Schema of CLAS distance. Given two songs X and Y, low-level audio descriptors are extracted, a number of SVM classifications
are run based on ground truth music collections, and high-level representations, containing probabilities of classes for each classifier,
are obtained. A distance between X and Y is calculated with correlation distances such as Pearson correlation distance.
System Architecture (10/11)
• Preference Visualization
– To visualize a user’s musical preferences in the form of a humanoid cartoon character, the Musical Avatar.
– Selected high-level semantic descriptors of each track are summarized across all tracks in the preference set.
– The descriptors are mapped to the visual elements of the avatar, which are implemented using Processing.
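The summarize-then-map idea above could look roughly like the sketch below. It assumes that summarizing means averaging each descriptor over the preference set and that a visual element is shown when its descriptor exceeds a threshold. The descriptor names, the rule table, and the threshold are all hypothetical, and the real avatar is rendered in Processing rather than Python.

```python
def summarize_profile(track_descriptors):
    """Average each descriptor across all tracks in the preference set."""
    names = track_descriptors[0].keys()
    count = len(track_descriptors)
    return {
        name: sum(track[name] for track in track_descriptors) / count
        for name in names
    }


# Hypothetical mapping from semantic descriptors to avatar elements.
AVATAR_RULES = {
    "genre_rock": "electric guitar",
    "mood_happy": "smiling face",
    "genre_electronic": "headphones",
}


def avatar_elements(profile, threshold=0.5):
    """Return the visual elements whose descriptor exceeds the threshold."""
    return sorted(
        element
        for name, element in AVATAR_RULES.items()
        if profile.get(name, 0.0) > threshold
    )


# Hypothetical per-track descriptor values.
tracks = [
    {"genre_rock": 0.9, "mood_happy": 0.6, "genre_electronic": 0.1},
    {"genre_rock": 0.7, "mood_happy": 0.8, "genre_electronic": 0.3},
]

profile = summarize_profile(tracks)
elements = avatar_elements(profile)
print(profile)
print(elements)
```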
System Architecture (11/11)
Calculate low-level features frame by frame
Summarize means & variances across all frames
Using SVMs, perform regression with a suitable classifier to produce the semantic dimensions
High-level semantic descriptor space contains a probability estimate for each class
Figure 1: Block diagram of the proposed methodology. The user’s music tracks are analyzed and represented with low-level audio features, which are later transformed, by means of classifiers, into semantic descriptors. The individual track descriptors are furthermore summarized into the user profile, which is finally mapped to a series of graphical features of the Musical Avatar.
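The first two blocks of the diagram (frame-by-frame feature computation, then summarization by mean and variance) can be sketched with a toy feature. Mean frame energy is used here purely for illustration; the actual system extracts timbral, temporal, and tonal features. The function names, frame sizes, and sample values are assumptions.

```python
from statistics import mean, pvariance


def frame_energies(samples, frame_size, hop_size):
    """Compute one toy low-level feature (mean energy) per frame."""
    features = []
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        frame = samples[start:start + frame_size]
        features.append(sum(s * s for s in frame) / frame_size)
    return features


def summarize(frame_features):
    """Summarize a per-frame feature by its mean and (population) variance."""
    return {"mean": mean(frame_features), "var": pvariance(frame_features)}


# Hypothetical audio samples: a quiet segment followed by a louder one.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 2.0, 0.0, -2.0]

energies = frame_energies(signal, frame_size=4, hop_size=4)
summary = summarize(energies)
print(energies)
print(summary)
```

The resulting fixed-length summary (here two numbers per feature) is what a classifier such as an SVM would receive as input, regardless of track length.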
Implementation
Demo System: http://musrec.upf.edu/avatar
Figure 4. Screenshots of (a) the recommendation output and (b) the generated preference visualization returned to a user.
Conclusion
• The authors presented a system for music recommendation and user preference visualization.
– It operates on content-based information extracted from audio, in the form of high-level semantic descriptors.
• The system employs the Last.fm and SoundCloud APIs to generate semantic user models.
• The system generates musical recommendations, relying on a semantic similarity measure between music tracks.
• The musical preferences of users are visualized in the form of Musical Avatars.
Thank You!