Fig 1 - uploaded by Jianyu Fan
Number of Weeks before Becoming Top10 Hits (UK Charts)  

Source publication
Conference Paper
The top 40 chart is a popular resource used by listeners to select and purchase music. Previous work on automatic hit song prediction focused on Western pop music. However, pop songs from different parts of the world exhibit significant differences. We performed experiments on hit song prediction using 40 weeks of data from Chinese and UK pop music...

Contexts in source publication

Context 1
... check whether it is better to predict the next week's data and the second next week's data, we counted the number of weeks hit songs take before becoming a top10 hit. Figure 1 and Figure 2 show the distributions of the number of weeks for UK and Chinese hit songs. ...
Context 2
... the following figures, the red line indicates the UK top 5 songs and the blue line indicates the Chinese top 5 songs. In Figure 12, it is clear that the values of danceability, energy, tempo and speechiness of the UK top 5 songs are higher than those of the Chinese top 5 songs. We can generalize that Chinese hit songs are more melodic and less energetic, and far fewer of them are suitable for dance parties. ...
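
The counting step described in Context 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the chart layout (one dict per week mapping a song identifier to its rank), the function names, and the toy data are all assumptions, and "weeks before becoming a top10 hit" is interpreted here as the number of weeks between a song's first chart appearance and its first week at rank 10 or better.

```python
from collections import Counter


def weeks_before_top10(weekly_charts):
    """Return {song: weeks between first chart entry and first top-10 rank}.

    weekly_charts is a hypothetical structure: a list of dicts, one per
    consecutive week, each mapping a song identifier to its chart rank.
    Songs that never reach the top 10 are left out of the result.
    """
    first_seen = {}
    weeks_to_hit = {}
    for week, chart in enumerate(weekly_charts):
        for song, rank in chart.items():
            # Remember the week in which the song first charted.
            first_seen.setdefault(song, week)
            # Record only the first time the song reaches rank 10 or better.
            if rank <= 10 and song not in weeks_to_hit:
                weeks_to_hit[song] = week - first_seen[song]
    return weeks_to_hit


def weeks_distribution(weeks_to_hit):
    """Histogram: number of weeks -> how many hit songs took that long."""
    return Counter(weeks_to_hit.values())


# Toy data (invented for illustration only).
charts = [
    {"A": 25, "B": 8},
    {"A": 12, "B": 5, "C": 30},
    {"A": 9, "C": 18},
    {"C": 10},
]
result = weeks_before_top10(charts)
histogram = weeks_distribution(result)
```

A histogram like this is the kind of summary a distribution plot would display; if most hits reach the top 10 within a week or two of charting, that bears on the choice between predicting the next week and the second next week.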

Similar publications

Conference Paper
Audio recordings and the corresponding transcripts are often used as prosthetic memory (PM) after meetings and lectures. While current research is mainly developing novel features for prosthetic memory, less is known on how and why audio recordings and transcripts are used. We investigate how users interact with audio and transcripts as prosthetic...
Preprint
In this paper, we propose to infer music genre embeddings from audio datasets carrying semantic information about genres. We show that such embeddings can be used for disambiguating genre tags (identification of different labels for the same genre, tag translation from a tag system to another, inference of hierarchical taxonomies on these genre tag...
Article
The article presents a method for segmentation of ethnomusicological field recordings. Field recordings are integral documents of folk music performances captured in the field, and typically contain performances, intertwined with interviews and commentaries. As these are live recordings, captured in non-ideal conditions, they usually contain signif...
Conference Paper
This paper presents a method for recognizing musical instruments in user-generated videos. Musical instrument recognition from music signals is a well-known task in the music information retrieval (MIR) field, where current approaches rely on the analysis of the good-quality audio material. This work addresses a real-world scenario with several res...
Article
Within the last 15 years, the field of Music Information Retrieval (MIR) has made tremendous progress in the development of algorithms for organizing and analyzing the ever-increasing large and varied amount of music and music-related data available digitally. However, the development of content-based methods to enable or ameliorate multimedia retr...

Citations

... In a cross-cultural exploration, a research study delved into Chinese and UK hit song prediction, providing a unique perspective on regional variations [12]. This research ventured beyond the conventional scope and brought to the fore the significance of cultural preferences in shaping the dynamics of music popularity. ...
... Collectively, these studies underscore the multifaceted nature of hit song prediction, incorporating diverse methodologies, audio features, and regional considerations. While challenges and limitations persist, including data availability and feature selection, these research endeavors have paved the way for future investigations that aim to refine machine learning techniques and enhance the accuracy of hit song prediction [11], [12]. The music industry continues to evolve, and the insights gleaned from these studies offer valuable guidance for stakeholders seeking to navigate the dynamic landscape of musical success. ...
Conference Paper
In the era of evolving music consumption, this systematic literature review researches the realm of predictive analytics for music streaming, specifically targeting Spotify's stream count prediction in Sri Lanka through machine learning methodologies. With streaming platforms shaping the music industry landscape, accurately predicting song popularity becomes essential for artists, producers, and industry stakeholders. This review analyzes global studies on machine learning's application in forecasting stream counts while defining their methodologies and outcomes. It intricately examines diverse machine-learning methodologies employed in prior research endeavors. Ranging from regression models and ensemble techniques to deep learning architectures, the spectrum of methodologies used in forecasting stream counts on music streaming platforms is elucidated. Noteworthy techniques such as support vector machines (SVM), random forests, and recurrent neural networks (RNNs) have demonstrated efficacy in capturing intricate patterns within music data for predictive analysis. Our paper highlights the significance of feature engineering and selection methods, underscoring their pivotal role in enhancing the accuracy of predictive models. Through this comprehensive study, this review aims to expose specific gaps in stream count prediction models tailored to Sri Lanka's varied music preferences and consumption habits. By illuminating these gaps, it aspires to stimulate future research endeavors focused on refining predictive models, ultimately empowering the Sri Lankan music industry with more insights for better strategic decision-making.
... Another interesting study [9] sought to compare the differences in terms of audio features between hit songs in two distinct markets, the Chinese and the United Kingdom. The importance of features was identified using timeweighted linear regression and the comparison showed that Chinese hit songs, in general, are more melodic, slower, and less energetic relative to the UK hit songs. ...
... In contrast to audio, lyrical, and meta-data modalities. Fan et al. [1] in 2013 worked on predicting whether a song will be a hit or not, analyzing Chinese and British songs. They then compared the two sets and examined how different factors influence whether songs from each region become hits. ...
Conference Paper
The music industry has grown from independent composers releasing their self-published work to an extensive industry wherein an enormous amount of money is at stake depending on whether or not a particular song will be a hit on the charts. As a result, the music production industry has become more digitized and data-centric in order to try and find a mechanism for predicting a hit song. Extensive research work has been done to predict the salability of a particular song being put out by composers. However, the current research has been limited to Western English language songs only. Due to the lack of datasets and preliminary research on regional language songs, hit prediction for regional language songs has not been given any attention in the existing state of the art. Therefore, in this research paper, we predict the profitability and salability of a Hindi language song by predicting whether or not the song will be a hit or in other words whether or not the song will be listed on top music charts. High-level features of a song have been incorporated into our models as parameters that will affect the profitability of the song in question. To demonstrate this, we created a new dataset of Hindi songs. Moreover, we compared the results of various algorithms such as Random Forest, Gaussian Naive Bayes, Multi-layer Perceptron (MLP), AdaBoost, and Logistic Regression optimized using SGD. Our results show that MLP achieved the best outcomes with a precision of 0.9429, recall of 0.8875, and F1 score of 0.8849.
... Early hit song prediction studies illustrate the complexity of this problem, delivering only weak classification results [3,4,5,6]. In recent years, more advanced approaches have been able to accurately predict hits and non-hits using audio features [7,8,9,10,11,12,13]; however, many other potentially useful sources of information about the songs are also available. In this study, we employ 12 Spotify audio features (energy, liveness, tempo, speechiness, acousticness, time signature, key, duration ms, loudness, valence, mode and danceability) drawn directly from Spotify, together with novel features based on Billboard music metadata (popularity continuity, genre class and title topic), as well as the topics extracted from the songs' lyrics, to identify Top 10 hits among Top 100 hits. ...
... Various algorithms have been applied to tackle this task, among them: Logistic Regression (LR), Support Vector Machine (SVM) and Neural Networks (NN) are commonly used [3,6,8,11]. Ni et al. [7] gained promising results in predicting UK Top 5 hits on the Top 40 single song charts, but again little implementation detail was provided. ...
... Ni et al. [7] gained promising results in predicting UK Top 5 hits on the Top 40 single song charts, but again little implementation detail was provided. Fan and Casey [11] used LR and SVM models to predict British and Chinese hit songs but found that audio features worked better for predicting Chinese hits than British ones, and that textual features worked best overall. ...
Preprint
Hit song prediction, one of the emerging fields in music information retrieval (MIR), remains a considerable challenge. Being able to understand what makes a given song a hit is clearly beneficial to the whole music industry. Previous approaches to hit song prediction have focused on using audio features of a record. This study aims to improve the prediction result of the top 10 hits among Billboard Hot 100 songs using more alternative metadata, including song audio features provided by Spotify, song lyrics, and novel metadata-based features (title topic, popularity continuity and genre class). Five machine learning approaches are applied, including: k-nearest neighbours, Naive Bayes, Random Forest, Logistic Regression and Multilayer Perceptron. Our results show that Random Forest (RF) and Logistic Regression (LR) with all features (including novel features, song audio features and lyrics features) outperforms other models, achieving 89.1% and 87.2% accuracy, and 0.91 and 0.93 AUC, respectively. Our findings also demonstrate the utility of our novel music metadata features, which contributed most to the models' discriminative performance.
... Dhanaraj & Logan [13] and Herremans et al [15] both try to predict the formulaic nature of a hit, the former through acoustic parameters and the latter through rhythmic parameters, making for an interesting spread of factors to choose from. Fan et al [14] and Prey et al [26] both talk about the cultural side of the story. The former compares and contrasts the Chinese and US charts, and the latter discusses the role that streaming services, in particular Spotify (the progenitor of our data), have in the perception of a song. ...
... Hence, to make robust predictions in Hit Song Science, musicological features became a part of the data being used as input for the prediction. Musicological features could include lyrical data [13,10], moods and emotions portrayed through the music [25], the actual audio of the song itself [20,23] and, more recently, objective metadata about the audio itself, such as the Echo Nest data, obtained from Spotify's API [8,14,15] -that this paper utilizes too. ...
... Talking about the multi-class classifiers, our random forest model performs with an accuracy of 56.2% and a precision of 54%. Fan and Casey perform the same experiment with a support vector machine, and achieve an accuracy of 56% on a similar dataset [14]. Additionally, Datla and Vishnu perform a similar multi-class experiment on lyrical data where they achieve predictions with a 79% precision [10]. ...
Preprint
The music industry is a multi-billion-dollar industry and to be able to accurately predict whether a song would catch the pulse of the audience will have a huge economic impact on the industry. The Billboard magazine is a world renowned music publication since 1984 and it releases the weekly ranking of the top 100 songs in various categories such as rock, pop, hip-hop, etc. Several studies have determined that it is possible to predict the approximate bucket of ranks that a song is likely to chart in using social and subjective indicators. The definitions of these indicators however can change over time, thus rendering the previous classifications erroneous. Furthermore, there is a lack of experimentation to predict actual ranks of the songs. Here, we report successful results from our experiments in predicting the ranks and the number of weeks the songs are likely to stay on the charts, using objective and well-defined features, obtained from Spotify's Web API. This paper analyzes the 19581 songs that have featured on Billboard's Hot 100 charts from December 12, 1970 to June 21, 2018 using 21 features from the Spotify API. It extends existing research about classifying songs into rank buckets of Top-10 and Top-40 using these objective features, and also demonstrates that it is possible to predict exact ranks of the songs within a root-mean-squared-error of 28 ranks and the number of weeks of charting within a root-mean-squared-error of 7 weeks. Further, this paper can demonstrate definitive trends between individual features and the ranks of the songs. Finally, it is shown that objective metadata about the songs serve as good indicators about the trends in the Billboard charts, and can be used to predict a song's performance on the charts within acceptable error rates.
... Researchers have also compared Chinese and Western pop songs. Fan and Casey performed Chinese and UK hit song prediction using 40 weeks of data from pop music charts [17]. They found danceability, energy, tempo and speechiness of UK hits are significantly higher than those of Chinese hits. ...
... In [15], a set of experiments for HSS prediction is presented using data collected from Chinese and UK pop music charts. In these experiments, a set of audio features are selected to predict whether a song will be a hit or not. ...
Article
The continuous evolution of multimedia applications is fostering applied research in order to dynamically enhance the services provided by platforms such as Spotify, Lastfm, or Billboard. Thus, innovative methods for retrieving specific information from large volumes of data related with music arises as a potential challenge within the Music Information Retrieval (MIR) framework. Moreover, despite the existence of several musical-based datasets, there is still a lack of information to properly assess an accurate estimation of the impact or the popularity of a song within a platform. Furthermore, the aforementioned platforms measure the popularity in various manners, thus increasing the difficulties in performing generalized and comparable models. In this paper, the creation of SpotGenTrack Popularity Dataset (SPD) is presented as an alternative solution to existing datasets that will facilitate researchers when comparing and promoting their models. In addition, an innovative multimodal end-to-end Deep Learning architecture named as HitMusicNet is presented for predicting popularity in music recordings. Experiments conducted show that the proposed architecture outperforms previous studies in the State-of-the-Art by incorporating three main modalities to the analysis, such as audio, lyrics and meta-data as well as a preliminary compression stage via autoencoder to better the capability of the model when predicting the popularity.
... Researchers have also compared Chinese and Western pop songs. Fan and Casey performed Chinese and UK hit song prediction using 40 weeks of data from pop music charts [17]. They found danceability, energy, tempo and speechiness of UK hits are significantly higher than those of Chinese hits. ...
Preprint
Whether literally or suggestively, the concept of soundscape is alluded to in both modern and ancient music. In this study, we examine whether we can analyze and compare Western and Chinese classical music based on soundscape models. We addressed this question through a comparative study. Specifically, corpora of Western classical music excerpts (WCMED) and Chinese classical music excerpts (CCMED) were curated and annotated with emotional valence and arousal through a crowdsourcing experiment. We used sound event detection (SED) and soundscape emotion recognition (SER) models with transfer learning to predict the perceived emotion of WCMED and CCMED. The results show that both SER and SED models could be used to analyze Chinese and Western classical music. The fact that SER and SED work better on Chinese classical music emotion recognition provides evidence that certain similarities exist between Chinese classical music and soundscape recordings, which permits transferability between machine learning models.
... This paper is an extension of a previous work [4] that used deep learning for hit song prediction from audio, which had been rarely attempted in the literature. Many previous works viewed hit song prediction as a regression (or rating) or classification problem with various approaches, including SVM classifiers based on latent topic features from audio and lyrics [5] or on human-annotated tags [6], Bayesian network based on lyric features only [7], and time weighted linear regression [8]. Different input features of songs other than audio and lyrics were also used, e.g. ...
Article
A model for hit song prediction can be used in the pop music industry to identify emerging trends and potential artists or songs before they are marketed to the public. While most previous work formulates hit song prediction as a regression or classification problem, we present in this paper a convolutional neural network (CNN) model that treats it as a ranking problem. Specifically, we use a commercial dataset with daily play-counts to train a multi-objective Siamese CNN model with Euclidean loss and pairwise ranking loss to learn from audio the relative ranking relations among songs. Besides, we devise a number of pair sampling methods according to some empirical observation of the data. Our experiment shows that the proposed model with a sampling method called A/B sampling leads to much higher accuracy in hit song prediction than the baseline regression model. Moreover, we can further improve the accuracy by using a neural attention mechanism to extract the highlights of songs and by using a separate CNN model to offer high-level features of songs.
... Still, MER has several challenges. First, music perception can be dramatically different if listeners are from different regions of the world and have various unique cultural backgrounds [5,18]. Second, it is difficult for researchers to collect ground truth data covering a wide range of the population that is well distributed across different parts of the world [5]. ...
Conference Paper
Emotion recognition is an open problem in the field of Affective Computing. Music emotion recognition (MER) has challenges including variability of musical content across genres, the cultural background of listeners, reliability of ground truth data, and the modeling of human hearing in computational domains. In this study, we focus on experimental music emotion recognition. First, we present a music corpus that contains 100 experimental music clips and 40 music clips from 8 musical genres. The dataset (the music clips and annotations) is publicly available at: http://metacreation.net/project/emusic/. Then, we present a crowdsourcing method that we use to collect ground truth via ranking the valence and arousal of music clips. Next, we propose a smoothed RankSVM (SRSVM) method. The evaluation has shown that the SRSVM outperforms four other ranking algorithms. Finally, we analyze the distribution of perceived emotion of experimental music against other genres to demonstrate the difference between genres.