PAPER
Automatic Playlist Generation from Self-Organizing Music Map
Pitoyo Hartono¹ and Ryo Yoshitake²
¹ Chukyo University, 101 Tokodachi, Kaizu-cho, Toyota-shi, Aichi 470-0393, Japan
² Nara Institute of Science and Technology, 8916 Takayama-cho, Ikoma-shi, Nara 630-0101, Japan
E-mail: hartono@ieee.org
Abstract: In this paper, we propose a method for automatically generating a music playlist that reflects a user's subjective preferences. The playlist is composed on top of a "Music Map", in which hundreds of songs from various genres, subjectively characterized by the user, are topologically aligned. A playlist is generated by designing a travel path in the Music Map that visits one song after another in a certain order. The topological nature of the Music Map ensures that the form of the travel path gives a certain character to the generated playlist. Thus, the problem of generating a playlist that fulfills a given criterion can be transformed into the problem of finding an optimum path, similar to the Traveling Salesman Problem (TSP). In the experiments, we construct a Music Map containing 500 songs and automatically generate playlists according to various criteria.
Keywords: music playlist, self-organizing map, music map, path optimization
1. Introduction
In the past several years, efficient compression techniques, low-cost Internet connections and the availability of handheld computing devices have allowed users to easily download and store massive volumes of digital music files. While acquiring music files has become very easy, managing and using these files has become increasingly difficult. Most commercially available music players provide functions for managing and creating playlists. However, users usually have to create their playlists manually by handpicking music files from collections organized according to fixed attributes, such as the name of the artist or the title of the song. Some music players are equipped with more advanced features that infer the user's preferences from past choices and then automatically generate a playlist. While these features contribute significantly to the usability of the device, the characteristics of the generated playlist are generally not intuitive to users, and it is therefore not easy for users to compile a playlist that reflects their preferences. The objective of this study is to propose a simple automatic playlist generation system whose output can be easily characterized by the users. Here, the intended characteristic refers not to the fine details of the songs' attributes or playing order, but to the overall nature of the playlist, which can be expressed with simple semantics such as "similar songs" or "smooth transition of songs". In this
study, the playlist is automatically generated on a so-called Music Map, which topologically aligns a collection of songs, characterized by subjective user-defined attributes, in a two-dimensional grid. This Music Map is created with the Self-Organizing Map (SOM) learning mechanism [1] by treating the attributes of a song as a high-dimensional vector. Because the Music Map is built in a two-dimensional space, it can be visualized, and thus provides an intuitive presentation of how the collection is organized. The Music Map is then used to automatically generate a playlist containing a subset of the collection. A playlist is generated by traveling over the map, visiting song after song in a certain order. The topological organization of the map ensures that similar songs are positioned close to each other, while dissimilar songs are positioned far apart. Hence, the form of the travel path is directly associated with the general characteristics of the generated playlist. For example, a travel path that covers a narrow area generates a playlist containing similar songs, while a travel path that keeps the distance between adjacent points small produces a playlist with smooth transitions between consecutive songs. It is clear that
the problem of characterizing a playlist can be directly
Journal of Signal Processing, Vol. 17, No. 1, January 2013 11
Journal of Signal Processing, Vol.17, No.1, pp.11-19, January 2013
translated into a path planning problem, and thus, a
well-defined combinatorial optimization problem. As
long as we can formulate the intended characteristics
of the generated playlist, any combinatorial optimiza-
tion method can be applied to solve this problem. Due
to its simplicity, we chose Simulated Annealing (SA)
[2] to optimize the travel path based on a given evalu-
ation function that represents the intended character-
istics of the playlist.
Studies on generating a music playlist on top of a map of music were reported in [3, 4, 5]. Our system is most similar to PocketSOM [5], but differs from it in two significant respects. The
first one is that in PocketSOM the songs are character-
ized by their acoustical attributes, while in our system
the songs can be characterized by users’ chosen at-
tributes. This implies that in our proposed system the
users have more freedom in expressing their personal
subjectivity. The second difference is that in Pock-
etSOM, the playlist is generated by the action of the
users to draw a path in the map, and then finely tune
the inclusions of particular songs. In our proposed sys-
tem, the playlist is automatically generated according
to an intuitive criterion given by the user. We consider
that PocketSOM focuses on the interactive interface
and controllability of the users, while we focus on the
intuitiveness and personalization flexibility. A music
playlist generation system using SA was proposed in
[6]. Although we also apply SA for the automatic
generation of playlists, our proposed system differs from this system in several respects. The most significant
difference is the nature of the generated playlist; in
[6], the playlist accommodates the minute demands
of the user. For example, the first song should be
a jazz song, the second song should be released in
2000, and so on, whereas in our system the generated
playlist is based more on global characteristics, such
as degree of song similarity/diversity. The second dif-
ference is that, in our system, SA is executed not in
the original dimension of the song-attribute vector but
in the low-dimensional Music Map. By limiting the
search space to low-dimensional space, the searching
process can be significantly simplified, and the gener-
ated playlist can be visualized as a travel path in the
Music Map that can be given as an intuitive feedback
to the users. The recommendation system proposed
in [7] is also based on specific constraints on the songs,
but our system focuses more on the overall characteristics of the playlist. The playlist generation systems in [8, 9] take a music "seed" and build a playlist based on similarity to the seed. In our system, the similarity of the songs is inherently encoded in the map, and therefore a seed is not needed.
The playlist generator in [10] introduced a personal-
ized neural network that takes various attributes from
a music piece and a time parameter to calculate the
preference of the user with regard to a music file. A
time-dependent playlist is then generated based on the
user’s preference. In [11, 12], a recommendation sys-
tem based on the habits or the situation of the user
was proposed. Although our system also considers a
personalized playlist generation system, the user pref-
erences are reflected in the topological order of the
map, and therefore additional inputs are not needed.
In creating the playlist, one of the most important
aspects of the method is the similarity measure used
to compare two songs. We are aware that there are
several methods for measuring the similarity of songs
[13, 14, 15, 16, 17]. However, due to its simplicity
and intuitiveness, we utilized the Euclidean distance
to measure the similarity of the songs in our system.
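As a minimal illustration of this similarity measure, the sketch below computes the Euclidean distance between two attribute vectors. The three-element vectors are hypothetical values chosen only for illustration; they are not taken from the experiments.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two attribute vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("attribute vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical, already normalized attribute vectors.
song_a = [0.40, 0.55, 1.0]
song_b = [0.42, 0.50, 1.0]
song_c = [0.90, 0.10, 0.0]

# A smaller distance means the songs are treated as more similar.
print(euclidean_distance(song_a, song_b))
print(euclidean_distance(song_a, song_c))
```

Under this measure, song_b is closer to song_a than song_c is, so song_a and song_b would be regarded as the more similar pair.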
Some multimedia retrieval systems [18, 19] apply interactive and adaptive schemes to extract multimedia content, not necessarily music, that matches the users' criteria. While this idea is interesting and important for building automatic playlist generation systems, we take a different approach. In the interactive schemes, the users have to interact repeatedly with the system to extract the content, while in our system the users only have to give their criteria to the system. We understand that the lack of interactive and training schemes prevents our system from generating finely tuned playlists, but this is not our goal. Our goal is to enable the users to easily characterize the playlist without repeatedly fine-tuning the selection of songs.
Some patented technologies [20, 21] assign "weighted" values to the songs in the database and use these values to automatically generate playlists. Though this idea is useful and can be implemented without placing an extra burden on the users, the selection criteria are often hidden from the users. In our system, control over the character of the playlists is given to the users, and the justification for the selected songs is visually presented to the users as the travel paths. We consider that the visual presentation of the song selection offers a kind of interaction between the system and the users.
This paper is organized as follows. In Section 2, an
overview of the proposed system is explained. Section
3 discusses the formation of a Music Map and the
generation of a playlist from the map. Experiments on generating playlists according to various criteria are described in Section 4, and conclusions and discussion are presented in the final section.
2. Playlist Generation System
In this study, a playlist is automatically gener-
ated on top of a Music Map according to a user-
specified criterion. Hence, the proposed system is con-
structed from two procedures: Music Map generation
and playlist generation. Music Map generation is exe-
cuted by aligning many songs into a two-dimensional
grid according to the competitive training rule of SOM
[1]. The songs, characterized by attributes such as the
artist, the length of the song and beats per minute,
are aligned according to their similarities in the map
in a self-organizing manner. The training rule en-
sures that similar songs are aligned in close neighbor-
hoods, whereas songs that differ greatly are remotely
positioned. Because a song is characterized by user-specified attributes, changing the attributes yields a different alignment in the map for the same set of songs. Because the users are free to characterize the songs, each Music Map reflects the subjective preferences of its user. The Music Map generation process is illustrated in Fig. 1, where a node represents one or more similar songs.
Fig. 1 Music map generation (diagram labels: artist, length, beats per minute; song; feature vector; Music Map)
As illustrated in Fig. 2, the playlist is then generated by touring the Music Map along a travel path, so that the songs that lie on the travel path are included in the playlist in the order in which they are visited. Because the songs in the Music Map are arranged according to their similarities, the form of the travel path directly characterizes the generated playlist, as illustrated in Fig. 3 and Fig. 4. Both figures show travel paths that include 8 songs; in the former the tour covers a small area, and in the latter it covers a wider area. The topological characteristics of the Music Map allow the path in Fig. 3 to produce a playlist of 8 similar songs, whereas the path in Fig. 4 produces a playlist with more diverse songs.
It is clear that the problem of characterizing a playlist can be translated into the problem of planning a travel path that reflects the intended character of the playlist. Choosing an optimum travel path for visiting N songs according to a given criterion can be treated as a combinatorial optimization problem similar to the Traveling Salesman Problem (TSP), which can be approached with any standard optimization method. It should be stressed that the generated travel paths are not always strict solutions of a TSP. However, we argue that the computational complexity of path generation in our study is comparable to that of TSP. To generate a playlist, we have to choose N songs from a total of M songs in the database, that is, one of the $\frac{M!}{(M-N)!\,N!}$ possible combinations, and each combination yields $N!$ possible paths. The number of candidate paths is therefore $\frac{M!}{(M-N)!\,N!}\,N! = \frac{M!}{(M-N)!}$. The number of distinct tours in a TSP with N cities is $\frac{(N-1)!}{2}$. For $N < M$ and $(M-N)! > 2M$, the latter condition being equivalent to the right-hand inequality below,
$$\frac{(N-1)!}{2} < \frac{M!}{(M-N)!\,N!}\,N! < \frac{(M-1)!}{2} \tag{1}$$
Fig. 2 Playlist generation (diagram labels: playlist's characteristic, evaluation function, path planning, playlist)
Fig. 3 Small diversity
Fig. 4 Large diversity
Equation (1) shows that the complexity of the path planning problem considered here lies between that of a TSP with N cities and that of a TSP with M (> N) cities. Thus, we can argue that the problem is comparable to TSP.
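The bounds in Eq. (1) can be checked numerically. The sketch below is only an illustration: it uses M = 500 (the collection size in the experiments) and N = 30 (one of the playlist lengths used later), and the function names are hypothetical.

```python
import math


def candidate_paths(m, n):
    """Ordered selections of n songs out of m: M! / (M - N)!."""
    return math.factorial(m) // math.factorial(m - n)


def tsp_tours(n):
    """Distinct tours of a symmetric TSP with n cities: (n - 1)! / 2."""
    return math.factorial(n - 1) // 2


m, n = 500, 30
lower = tsp_tours(n)            # TSP with N cities
middle = candidate_paths(m, n)  # playlist path planning
upper = tsp_tours(m)            # TSP with M cities

# The condition (M - N)! > 2M holds, and so does the chain in Eq. (1).
print(math.factorial(m - n) > 2 * m)
print(lower < middle < upper)
```

Python integers have arbitrary precision, so the factorials are compared exactly rather than through floating-point approximations.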
3. Music Map and Path Planning
3.1 Music map
In this study, we utilize the conventional learn-
ing algorithm of SOM to generate the Music Map.
SOM has the characteristic of preserving the topo-
logical relationship of high-dimensional data in the
low-dimensional map. That is, data similar in their
original high-dimensional space are projected so as to remain in each other's vicinity in the low-dimensional map, and conversely, data that are far from each other in the original space are also positioned far from each other in the map. SOM is often used to visualize high-dimensional data without compromising their topological order. In this study, we utilize the mapping characteristics of SOM to create a Music Map in which each song is treated as an S-dimensional vector, where S is the number of attributes that characterize each song. The map contains $R_x \times R_y$ S-dimensional reference vectors (illustrated as circles in Fig. 1), arranged in a two-dimensional grid.
In the learning process of SOM, for each presentation of an input vector, a winner is chosen from the reference vectors as follows:
$$\mathrm{win} = \arg\min_j \| W_j(t) - X(t) \| \tag{2}$$
In Eq. (2), win is the index of the reference vector most similar to the given input X, and $W_j(t)$ is the j-th reference vector at time t.
After the winner is chosen, its reference vector is modified as follows:
$$W_{win}(t+1) = W_{win}(t) + \eta(t)\,(X(t) - W_{win}(t)) \tag{3}$$
where $\eta(t)$ denotes the learning rate, which is a monotonically decreasing function of t.
Equation (3) implies that the winner is modified
toward the input vector. It is obvious that after the
learning process, the winner is the representation of
the input vector (in this case, the song) in the Music
Map.
To ensure the topological correctness of the song data in the Music Map, not only the winner but also the rest of the reference vectors are modified, as follows:
$$W_i(t+1) = W_i(t) + \eta(t)\,\Lambda(\mathrm{dist}(\mathrm{win}, i))\,(X(t) - W_i(t)), \qquad \Lambda(x) = \exp\left(-\frac{x^2}{\sigma(t)}\right) \tag{4}$$
In Eq. (4), dist(win, i) denotes the distance between the positions of the winner and the i-th reference vector on the Music Map, Λ is the neighborhood function and σ is a monotonically decreasing scaling function. Equation (4) implies that reference vectors positioned in the vicinity of the winner are also modified toward the input vector, albeit with reduced intensity. Consequently, the reference vectors in the neighborhood of a winner come to represent input vectors (songs) similar to the one represented by the winner. Thus, by iterating the modification process, a Music Map that reflects the topological characteristics of the songs is gradually generated.
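The update rules in Eqs. (2) to (4) can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in the experiments: the linear decay of η and σ, the initial values eta0 and sigma0, the random initialization and the toy input vectors are all assumptions, since the text only states that both functions decrease monotonically.

```python
import math
import random


def find_winner(weights, x):
    """Eq. (2): grid position of the reference vector closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def train_som(songs, rows, cols, epochs=30, eta0=0.5, seed=0):
    """Train a rows x cols map; returns {(row, col): reference vector}."""
    rng = random.Random(seed)
    dim = len(songs[0])
    # Random initial reference vectors in [0, 1) (assumed initialization).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = float(max(rows, cols)) ** 2      # assumed initial scaling
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # assumed linear decay, never 0
        eta = eta0 * decay
        sigma = sigma0 * decay
        for x in rng.sample(songs, len(songs)):   # every song once per epoch
            win = find_winner(weights, x)
            for node, w in weights.items():
                d = math.dist(node, win)          # distance on the grid
                h = math.exp(-d * d / sigma)      # neighborhood function
                for k in range(dim):
                    # Eq. (4); for node == win, h = 1 and this is Eq. (3).
                    w[k] += eta * h * (x[k] - w[k])
    return weights


# Hypothetical normalized attribute vectors forming two groups of songs.
songs = [[0.10, 0.20, 0.0], [0.15, 0.25, 0.0],
         [0.90, 0.80, 1.0], [0.85, 0.90, 1.0]]
som = train_som(songs, rows=4, cols=4)
for s in songs:
    print(s, "->", find_winner(som, s))
```

Each song is finally placed at the node returned by find_winner, which is the node that represents it in the map.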
3.2 Travel path and playlist
In the proposed system, a playlist is generated by
planning a travel path on the Music Map. The form
of the travel path directly characterizes the generated
playlist. A travel path that tours a limited area pro-
duces a playlist with similar songs, and the travel path
that covers a large area generates a playlist with rich
diversity.
Because the character of a playlist is defined by the form of the travel path, characterizing a playlist requires planning a travel path in the Music Map. For this purpose, a playlist containing N songs is represented as a travel path, P, containing the coordinates of the N songs in the Music Map:
$$P = \{p_1, p_2, \ldots, p_N\}, \qquad p_i = (x_i, y_i) \tag{5}$$
where $p_i = (x_i, y_i)$ denotes the coordinates of the i-th visited song in the Music Map. Consequently, the total length of a travel path, L(P), is calculated as follows:
$$L(P) = \sum_{i=1}^{N-1} \| p_i - p_{i+1} \| \tag{6}$$
Because we use a two-dimensional map, the Music Map can be visualized. In this study, the playlist is automatically generated by creating a travel path that optimizes a mathematically defined evaluation function. The playlist generation problem is treated as a path planning problem, in which the intended characteristic of the playlist is translated into an evaluation function. For example, the automatic generation of a playlist containing N similar songs can be translated into a path planning problem that minimizes the evaluation function L(P) in Eq. (6). A varied playlist with the same number of songs can be generated by choosing a travel path P that maximizes L(P).
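The path length L(P) in Eq. (6) is straightforward to compute. The sketch below is an illustration with hypothetical coordinates on a 22 × 22 grid; it contrasts a path confined to a small area with one that crosses the map.

```python
import math


def path_length(path):
    """Eq. (6): sum of distances between consecutive points of the path."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))


# Hypothetical node coordinates.
narrow = [(3, 3), (3, 4), (4, 4), (4, 3)]     # stays in one neighborhood
wide = [(0, 0), (21, 21), (0, 21), (21, 0)]   # jumps across the map

print(path_length(narrow))
print(path_length(wide))
```

Minimizing this value favors paths like the first one (similar songs), while maximizing it favors paths like the second one (varied songs).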
To find the optimum path, any known deterministic or heuristic combinatorial optimization method, such as Dijkstra's Algorithm, Evolutionary Strategy or Genetic Algorithm, can be applied. In this study, we applied SA. It should be stressed that we chose SA because of its simplicity, not because of its superiority to other methods. The efficiency of SA is indicated by the short computational time required to generate the playlists.
In generating a path in the Music Map, it is necessary to specify the number of songs, N, and the evaluation function that reflects the intended characteristic of the playlist. Initially, a path containing N songs is randomly generated. In every iteration of SA at time t, one randomly chosen song in path P(t) is mutated by replacing it with another song that is not already included in the path, thus generating a new path, P(t + 1). By calculating the difference between the values of the given evaluation function for P(t) and P(t + 1), we can measure the improvement brought by the mutation. This measure is described by a function D(P(t), P(t + 1)), which takes a positive value for an improvement and a negative value for a deterioration. The absolute value of D shows the degree of improvement or deterioration. When the new path improves the evaluation, the search continues from this new path. Conversely, when the evaluation deteriorates, the probability of accepting the inferior path is calculated as follows:
$$p_a = \exp\left(\frac{D(P(t), P(t+1))}{T(t)}\right), \qquad T(t) = a\,T(t-1) \tag{7}$$
In Eq. (7), T is a parameter called temperature, which is gradually reduced as the number of SA iterations increases, and the coefficient a is empirically set to 0.95. The search continues from the inferior path P(t + 1) with a probability of $p_a$, whereas with a probability of $1 - p_a$ the inferior path is discarded and the search continues from the original path, P(t). Accepting an inferior path with a small probability is intended to prevent the search from being trapped at a local optimum. The optimization process is iterated until it reaches an empirically set limit of 10,000 iterations.
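The search procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the initial temperature t0, the random seed, the toy 6 × 6 map with one song per node and the function names are all hypothetical, while a = 0.95 and the default of 10,000 iterations follow the text.

```python
import math
import random


def path_length(points):
    """Total length L(P) of a path, as in Eq. (6)."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def anneal(positions, n, evaluate, maximize=False,
           steps=10000, t0=1.0, a=0.95, seed=0):
    """Search for n of the given songs (n < len(positions)).

    positions: one (x, y) map coordinate per song (assumed layout).
    Returns the final path as a list of song indices and its score.
    """
    rng = random.Random(seed)
    m = len(positions)
    path = rng.sample(range(m), n)                # random initial path
    score = evaluate([positions[i] for i in path])
    temperature = t0
    for _ in range(steps):
        in_path = set(path)
        unused = [i for i in range(m) if i not in in_path]
        candidate = path[:]
        candidate[rng.randrange(n)] = rng.choice(unused)  # mutate one song
        cand_score = evaluate([positions[i] for i in candidate])
        # D > 0 is an improvement, D < 0 a deterioration.
        d = cand_score - score if maximize else score - cand_score
        # Eq. (7): an inferior path is accepted with probability exp(D / T).
        if d >= 0 or rng.random() < math.exp(d / temperature):
            path, score = candidate, cand_score
        # Cooling; the floor guards against underflow to zero in long runs.
        temperature = max(a * temperature, 1e-12)
    return path, score


grid = [(x, y) for x in range(6) for y in range(6)]   # toy map
final_path, final_score = anneal(grid, n=5, evaluate=path_length, steps=2000)
print([grid[i] for i in final_path], final_score)
```

Passing maximize=True with the same evaluation function searches for a varied playlist instead of a similar one.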
4. Experiment
In the experiment, a Music Map with the size of
22 ×22 is trained with 500 songs from various gen-
res, artists and countries. Each song is characterized
by 9 attributes: length of the song, beats per minute
(BPM), gender of the artist(s), number of artists (1
for a solo artist or the number of band members
for groups), variables distinguishing Japanese artists
from non-Japanese artists and active from non-active
artists, and the subjective rating of the song defined
by the user. The first two attributes are the phys-
ical characteristics of the songs, the next four at-
tributes reflect the cultural aspects of the song, and
the final one is the subjective preference of the user.
These combined attributes are then expressed as a 14-
dimensional vector.
To evaluate the accuracy of the map, an error value, E, that describes the difference between the vector of the i-th song, $X_i$, and its reference vector in the Music Map, $W_{win(i)}(t)$, at time t is defined as follows:
$$E(t) = \frac{1}{N} \sum_{i=1}^{N} \| X_i - W_{win(i)}(t) \| \tag{8}$$
Fig. 5 Learning curve (panel title: Music Map Learning Curve; horizontal axis: epoch, 0 to 2 × 10^4; vertical axis: average error)
Figure 5 shows the learning curve of a Music Map
generation over 20,000 epochs. In each epoch, every
song is presented for training the map. It is obvious
that the accuracy of the map gradually increases with
the learning iterations.
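The error in Eq. (8) is the average distance between each song vector and its winning reference vector. A minimal sketch follows, with hypothetical two-dimensional vectors used only to make the arithmetic easy to follow.

```python
import math


def map_error(songs, reference_vectors):
    """Eq. (8): mean distance from each song to its closest reference vector."""
    total = 0.0
    for x in songs:
        # The winner of x is the closest reference vector, as in Eq. (2).
        total += min(math.dist(w, x) for w in reference_vectors)
    return total / len(songs)


songs = [[0.0, 0.0], [1.0, 1.0]]
reference_vectors = [[0.0, 0.0], [1.0, 0.0]]

# First song: distance 0; second song: distance 1 to (1, 0); mean is 0.5.
print(map_error(songs, reference_vectors))
```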
Fig. 6 Music map (areas A, B and C are marked; song labels in the figure: Etude in E major, Op. 10 No. 3 "Tristesse" (Chopin); Le Nozze di Figaro (Mozart); Adagio for Strings (Samuel Barber); Gymnopedie (Erik Satie); Bohemian Rhapsody (Queen); Lovers in Japan (Allister); Draw the line (Aerosmith); Lost! (Coldplay); Hysteria (Def Leppard); Tear It Up (Andrew W.K); Don't Know Why (Norah Jones); Lacrima (Kokia); Hot (Avril Lavigne); Castle Imitation (Onizuka Chihiro))
The generated Music Map is shown in Fig. 6. We can observe several neighborhoods in the Music Map in which songs that share similar characteristics are closely clustered. For example, area A contains songs by female artists, classical music is clustered in area B, and rock music is positioned in area C. There are also several vacant spaces in the map where no songs are aligned; these spaces can be considered the borders between neighboring clusters.
As explained in the previous section, a playlist is created by planning a travel path in the Music Map. It should be noted that a node may represent more than one similar song. If such a node is included in the travel path, the system randomly chooses one of the songs to be included in the playlist.
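The conversion from a travel path to a playlist can be sketched as below. The node coordinates and song names are hypothetical. Drawing without replacement, so that a node visited twice does not contribute the same song twice, is an assumption of this sketch rather than something stated in the text.

```python
import random


def path_to_playlist(path, songs_at_node, seed=None):
    """Pick one not-yet-used song at random from every node on the path."""
    rng = random.Random(seed)
    remaining = {node: list(songs) for node, songs in songs_at_node.items()}
    playlist = []
    for node in path:
        candidates = remaining[node]
        if not candidates:
            raise ValueError(f"node {node} has no unused songs left")
        playlist.append(candidates.pop(rng.randrange(len(candidates))))
    return playlist


# Hypothetical map content: node (0, 0) holds two similar songs.
songs_at_node = {(0, 0): ["song A", "song B"], (0, 1): ["song C"]}
path = [(0, 0), (0, 1), (0, 0)]
playlist = path_to_playlist(path, songs_at_node, seed=1)
print(playlist)
```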
We ran experiments to automatically generate sev-
eral kinds of playlists, each with a unique characteris-
tic.
4.1 Playlist of similar songs
The objective of this experiment is to automatically generate a playlist containing N similar songs. This objective can be achieved by minimizing the total distance of the tour in the Music Map. This problem is similar to TSP, except that only N songs from the whole collection are chosen and some positions in the map may be visited more than once if they contain several songs. The evaluation function for this objective is defined as follows:
$$E_1 = \sum_{i=1}^{N} \| p_i - p_{i+1} \|, \qquad p_{N+1} = p_1 \tag{9}$$
Setting N = 30 and executing SA to search for a path that minimizes E1, we obtain the travel path shown in Fig. 7. The optimization process using SA is shown in Fig. 8.
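Unlike L(P) in Eq. (6), the evaluation function E1 in Eq. (9) closes the tour by returning from the last song to the first. A minimal sketch, with hypothetical coordinates:

```python
import math


def e1(path):
    """Eq. (9): length of the closed tour, with p_{N+1} = p_1."""
    n = len(path)
    return sum(math.dist(path[i], path[(i + 1) % n]) for i in range(n))


# A unit square: three steps along the path plus the closing step.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(e1(square))
```

The modulo index wraps the last point back to the first, which is the only difference from the open path length.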
Fig. 7 E1 minimization path (N = 30)
Figures 9 and 10 show the paths for N= 40 and
N= 50, respectively, and their optimization processes
are shown in Fig. 11.
Fig. 8 E1 minimization process (N = 30) (horizontal axis: log(Step); vertical axis: total distance)
Fig. 9 E1 minimization path (N = 40)
Fig. 10 E1 minimization path (N = 50)
Fig. 11 E1 minimization process (N = 40, N = 50) (horizontal axis: log(Step); vertical axis: total distance)
We also generate a playlist according to the evaluation function in Eq. (10):
$$E_2 = Av(P) + SD(P) \tag{10}$$
$$Av(P) = \frac{1}{N} \sum_{i=1}^{N-1} \| p_i - p_{i+1} \|$$
$$SD^2(P) = \frac{1}{N} \sum_{i=1}^{N-1} \left( \| p_i - p_{i+1} \| - Av(P) \right)^2$$
In this case, the total travel distance and the stan-
dard deviation of the distance between two consecu-
tive songs are minimized, resulting in a short travel
path with a regulated distance between songs. The
actual travel paths for 40 and 50 songs are shown in
Fig. 12 and Fig. 13. By regulating the distance be-
tween two consecutive songs, we can generate travel
paths that are different from the ones optimized with
unregulated distance, as in Fig. 9 and Fig. 10.
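The regulated evaluation function in Eq. (10) can be sketched as follows. The code keeps the normalization of the source formulas, dividing by N although the sums run over N − 1 steps; the example coordinates are hypothetical.

```python
import math


def step_lengths(path):
    """Distances between consecutive songs on the path."""
    return [math.dist(p, q) for p, q in zip(path, path[1:])]


def av(path):
    """Av(P): sum of the step lengths divided by N, as in Eq. (10)."""
    return sum(step_lengths(path)) / len(path)


def sd(path):
    """SD(P): spread of the step lengths around Av(P), as in Eq. (10)."""
    mean = av(path)
    return math.sqrt(sum((d - mean) ** 2 for d in step_lengths(path)) / len(path))


def e2(path):
    """E2 = Av(P) + SD(P)."""
    return av(path) + sd(path)


# Two paths with the same total length but different regularity.
even = [(0, 0), (1, 0), (2, 0), (3, 0)]      # steps 1, 1, 1
uneven = [(0, 0), (0.5, 0), (1, 0), (3, 0)]  # steps 0.5, 0.5, 2

print(e2(even), e2(uneven))
```

Both paths have the same Av(P), so the lower E2 of the first path comes entirely from its more uniform step lengths.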
4.2 Playlist with varied songs
In this experiment we create a playlist containing a wide variety of songs. A playlist with such characteristics can be created by exploring various areas of the Music Map. The travel path does not have to be a loop, but for simplicity we maximize the evaluation function E1 in Eq. (9) for a tour of N = 30 songs and obtain the path shown in Fig. 14. The maximization process with SA is shown in Fig. 15. Figure 14 clearly shows that the generated path almost always moves diagonally across the map, which consequently maximizes the total distance.
Fig. 12 Path for E2 (N = 40)
Fig. 13 Path for E2 (N = 50)
Fig. 14 E1 maximization path
Fig. 15 E1 maximization process (horizontal axis: step, 0 to 10,000; vertical axis: evaluation value E1)
We also generate a playlist containing various kinds of songs with a uniform interval between two consecutive songs; that is, we maximize E2 in Eq. (11). The generated path, shown in Fig. 16, is clearly different from the unregulated path in Fig. 14.
$$E_2 = \frac{L(P)}{SD(P)} \tag{11}$$
4.3 Gradually changing playlist
In this experiment, we generate a gradually changing playlist that covers many kinds of songs. Because the playlist has to cover many kinds of songs, the first song should be far from the last song, while the distance between two consecutive songs should be small. Hence, we maximize E3 in Eq. (12):
$$E_3 = \frac{\| p_1 - p_N \|}{Av(P)} \tag{12}$$
The actual paths for N = 10 and N = 30 are shown in Fig. 18 and Fig. 19, respectively, and their maximization processes are shown in Fig. 20.
Fig. 16 E2 maximization path
Fig. 17 E2 maximization process (horizontal axis: step, 0 to 3,000; vertical axis: evaluation value E2)
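The evaluation function E3 can be sketched as the distance between the first and last songs divided by Av(P). The sketch assumes the Av(P) normalization of Eq. (10), a path of at least two distinct points, and hypothetical coordinates.

```python
import math


def e3(path):
    """Distance between first and last song, divided by Av(P)."""
    steps = [math.dist(p, q) for p, q in zip(path, path[1:])]
    average = sum(steps) / len(path)   # Av(P) with the 1/N normalization
    return math.dist(path[0], path[-1]) / average


# Both paths start and end at the same two points.
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]     # small, steady steps
back_forth = [(0, 0), (3, 0), (0, 0), (3, 0)]   # large jumps

print(e3(straight), e3(back_forth))
```

The straight path scores higher because it covers the same end-to-end distance with smaller steps, which is the gradual change the criterion is meant to reward.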
Fig. 18 E3 maximization path (N = 10)
Fig. 19 E3 maximization path (N = 30)
The generated playlist for N = 10 is shown in Fig. 21, from which it is clear that the change between two consecutive songs is gradual but the first and last songs differ significantly.
All experiments in this study were run on a Windows machine with a 2.4 GHz CPU and 2 GB of RAM. The computational time for generating the Music Map was about 3 seconds, while the computational times for generating the playlists were between 100 and 300 msec.
Fig. 20 E3 maximization process (N = 10, N = 30) (horizontal axis: log(Step); vertical axis: evaluation value E3)
Amor, Amor (Gipsy Kings)
Bamboleo (Gipsy Kings)
Faena (Gipsy Kings)
Quiero Saber (Gipsy Kings)
Un Amor (Gipsy Kings)
When I Fall in Love (Bill Evans)
Witchcraft (Bill Evans)
Don't Stop Til You Get Enough (Michael Jackson)
Wanna Be Startin Somethin (Michael Jackson)
1ere Gymnopedie (Erik Satie)
Fig. 21 Playlist for E3 maximization (N = 10)
5. Conclusions and Future Works
In this study, we proposed a method for automatically generating a playlist from a collection of songs aligned as a two-dimensional Music Map. The topological correctness of the map ensures that the characteristics of the playlists are determined by the form of the tour paths in the Music Map. By translating the problem of characterizing the playlist into a combinatorial optimization problem, we can easily create playlists that fulfill the users' criteria. In contrast with many existing automatic playlist generation systems, which often do not give users any information about the playlist generation criteria, the proposed system allows the users to characterize the playlist. Furthermore, the system gives visual feedback to the users in the form of the travel path in the Music Map, so that they can confirm that the playlist fulfills their intention. The freedom of the users to characterize the songs ensures that the created Music Map forms a kind of personalized preference space. Furthermore, the ability of the users to intuitively specify a particular kind of touring path in the Music Map ensures, to some extent, that the created playlist reflects the subjective preferences of the users. This ability to reflect the subjective preferences of the users distinguishes the proposed system from random-playlist systems and other systems that do not support personalization. One of the main limitations of the proposed system is the requirement that the intended characteristics of the playlist be expressed as mathematical formulas. To increase the usability of the proposed system, we plan to develop a method that translates semantic descriptions into mathematical expressions.
Currently, the map in this study has a fixed capacity, and after the training process it is not possible to add new songs without retraining the map using all the old and new songs. Although the retraining time is short, we are aware that this problem limits the usability of the proposed system. Hence, in the future we plan to improve the interface of this system to allow the users to add songs with minimal modification of the map.
We also plan to study the effect of the selection of the songs' attributes. We are aware that, technically, the selection of the attributes does not affect the procedures for generating the map and optimizing the travel path. However, we believe that there is a strong psychological correlation between the selection of the attributes and the satisfaction of the users with the automatically generated playlist.
Acknowledgements
P.H. thanks Artificial Intelligence Research Promo-
tion Foundation and The Hibi Science Foundation for
partially supporting this study.
References
[1] T .Kohonen: Self-organized Formation of Topologically
Correct Feature Maps, Biological Cybernetics, Vol. 43, pp.
59-69, 1982.
[2] S. Kirkpatrick et al.: Optimization by Simulated Anneal-
ing, Science, Vol. 220, No. 4598, pp. 671-680, 1983.
[3] J. Frank, et al.: Map-Based Music Interfaces for Mobile
Devices, Proceedings of the 16th ACM Int. Conf. on Mul-
timedia, pp. 981-982, 2008.
[4] J. Frank et al.: Ambient Music Experience in Real and
Virtual Worlds Using Audio Similarity, Proceedings of the
1st ACM Int. Workshop on Semantic Ambient Media Ex-
periences, pp. 9-16, 2008.
18 Journal of Signal Processing, Vol. 17, No. 1, January 2013
9
[5] R. Neumayer et al.: PlaySOM and PocketSOM Player -
Alternative Interface to Large Music Collections,
Proceedings of the Int. Conf. on Music Information
Retrieval, pp. 618-623, 2005.
[6] S. Pauws et al.: Music Playlist Generation by Adapted
Simulated Annealing, Information Sciences, Vol. 178, No.
3, pp. 647-662, 2008.
[7] J. J. Aucouturier and F. Pachet: Scaling Up Music
Playlist Generation, Proceedings of IEEE Int. Conf. on
Multimedia and Expo, pp. 105-108, 2002.
[8] J. C. Platt et al.: Learning a Gaussian Process Prior for
Automatically Generating Music Playlists, Advances in
Neural Information Processing Systems, pp. 1425-1432,
2002.
[9] A. Flexer et al.: Playlist Generation Using Start and End
Songs, Proceedings of the 9th Int. Conf. on Music Infor-
mation Retrieval, pp. 173-178, 2008.
[10] N.H. Liu et al.: An Intelligent Music Playlist Generator
Based on the Time Parameter with Artificial Neural Net-
works, Expert Systems with Applications, Vol. 37, No. 4,
pp. 2815-2825, 2010.
[11] A. Andric and G. Haus: Automatic Playlist Generation
Based on Tracking Users Listening Habits, Multimedia
Tools and Applications, Vol. 29, No. 2, pp. 127-151, 2006.
[12] K. Kaji et al.: A Music Recommendation System Based on
Annotation about Listeners’ Preferences and Situations,
Proceedings of the First Int. Conf. on Automated Produc-
tion of Media Content for Multi-Channel Distribution, pp.
231-234, 2005.
[13] A. Ghias et al.: Query by Humming: Music Information
Retrieval in an Audio Database, Proceedings
of the 3rd ACM Int. Conf. on Multimedia, pp. 231-236,
1995.
[14] K. Hoashi et al.: Feature Space Modification for Content-
Based Music Retrieval Based on User Preferences, Pro-
ceedings of IEEE Int. Conf. on Acoustics, Speech and Sig-
nal Processing, pp. 517-520, 2006.
[15] M. Goto and T. Goto: MUSICREAM: New Music Play-
back Interface for Streaming, Sticking, Sorting and Re-
calling Musical Pieces, Proceedings of the Int. Conf. on
Music Information Retrieval, pp. 404-411, 2005.
[16] G. Tzanetakis and P. Cook: Musical Genre Classification
of Audio Signals, IEEE Trans. on Speech and Audio
Processing, Vol. 10, No. 5, pp. 293-302, 2002.
[17] A. Berenzweig et al.: A Large-Scale Evaluation of Acous-
tic and Subjective Music-Similarity Measures, Computer
Music Journal, Vol. 28, No. 2, pp. 63-76, 2004.
[18] T. S. Huang et al.: Active Learning for Interactive
Multimedia Retrieval, Proceedings of the IEEE, Vol. 96,
No. 4, pp. 648-667, 2008.
[19] H. Qi et al.: Sound Database Retrieved by Sound, Acous-
tical Science and Technology, Vol. 23, No. 6, pp. 293-300,
2002.
[20] S. Ward: System and Method for Creating Dynamic
Playlists, U.S. Patent US6526411, Issued February 25,
2003.
[21] D. Plastina et al.: Media Item Subgroup Generation from
a Library, U.S. Patent US7756388, Issued July 13, 2010.
Pitoyo Hartono received his
B. Eng., M. Eng., and Dr. Eng. degrees
from the Department of Applied Physics,
Waseda University, in 1993, 1995, and
2002, respectively. He worked as
a software engineer at Hitachi Ltd.
from 1995 to 1998. He was a research
associate and visiting lecturer
at Waseda University between 2001
and 2005. From 2005 to 2010 he
was an Associate Professor at Future
University Hakodate, and since 2010
he has been a Professor in the School of
Information Science and Technology, Chukyo University. His
research interests include the theory and applications of
computational intelligence, signal processing, and adaptive robotics.
Ryo Yoshitake received his
Bachelor's degree from the
Department of Information Science,
Future University Hakodate, in 2010
and his Master's degree from Nara
Institute of Science and Technology
in 2012. In 2012 he joined DaiNippon
Printing as a software engineer.
(Received August 7, 2012; revised October 12, 2012)
Journal of Signal Processing, Vol. 17, No. 1, January 2013 19