
Music Popularity Prediction

Introduction

The music industry has grown tremendously thanks to the internet and streaming platforms. The factors that make a song successful have therefore drawn considerable attention.

Araujo et al.[1] predicted whether a song will be popular using both non-acoustic data (e.g. artist, previous ranks) and acoustic data. Their models reach an AUC above 80%, and performance was only 5.23% higher when acoustic features were used, suggesting that acoustic features may be negligible.
In contrast, Colley et al.[2] found that YouTube view count and the Spotify-defined Popularity metric are highly correlated with some high-level acoustic metrics provided by Spotify, such as energy, acousticness and instrumentalness. However, popularity on TikTok, the recently rising short-form video platform that is widely believed to have a great impact on the music industry, turned out to have little correlation with any of the acoustic metrics.

The TikTok result is interesting because popular songs that are strongly associated with TikTok tend to be characterized by a very catchy melody rather than by a popular singer or producer[3].

Community members often have their own "Heard it on TikTok" moment during their day to day lives. 72% of TikTokers agree that they associate certain songs with TikTok.[4]

Although TikTok serves streaming video rather than pure music, which may explain the low correlation, the TikTok statement above suggests that lesser-known artists now have a better chance of breaking out than before.

Considering the factors mentioned above, it seems to me that a song's debut may be strongly affected by extrinsic features, such as the popularity of the artist, the producer and the promotion. Over the long term, however, the intrinsic (acoustic) features of a song still play an important role in maintaining its popularity.


Targets

  • Verify the hypothesis stated in the last paragraph above.

  • Predict music popularity (as a time series) after a song first enters the Spotify charts.

    • with low-level acoustic features (e.g. spectrogram) only
      see whether a transformer can overcome the "semantic gap"[5]

      Generic low-level features of songs, like the mel-spectrogram used in and also throughout this work, may suffer from the “semantic gap” and cannot lead to an accurate prediction model for a high-level concept such as hotness.

  • Compare sample songs from the set of so-called "TikTok songs" with classic songs.

How to Define Popularity

  • According to Lee and Lee[6], popularity can be described from various perspectives.


  • We can use either stream count or chart rank as the criterion.

    • In my research, I currently use the stream count as the metric.

Data Collection

  • Spotify track statistics: kworb.net

  • audio source:

    1. Spotify API preview (30 s)

Methodology

Though the ultimate goal is to predict popularity as a time series (described in the MP3 section below), I started by training the MP2 model as a milestone toward MP3.

Model

  • Transformer

MP3 (Music Popularity Period Prediction)

  • with decoder
  • Predict popularity-time curve

MP2 (Music Popularity Prediction)

  • predict the summation metric[6] as a regression problem

  • the fine-tuned MP2 model (encoder) can be reused in the MP3 model.

  • Encoder: AST (Audio Spectrogram Transformer)[7]

  • loss: MSE
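
The MP2 setup above can be sketched as follows. This is only a minimal illustration, assuming the summation metric is the total of a track's daily stream counts over the observation window; the function names and the stream numbers are hypothetical, and any target scaling (for example a log transform) is left out.

```python
from typing import Sequence


def summation_target(daily_streams: Sequence[float]) -> float:
    """Sum a track's daily stream counts into one regression target."""
    return float(sum(daily_streams))


def mse_loss(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between predicted and true targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical daily stream counts for two tracks (made-up numbers).
track_a = [120.0, 100.0, 80.0]
track_b = [10.0, 30.0, 20.0]
targets = [summation_target(track_a), summation_target(track_b)]

# Stand-in for what a model might output for the two tracks.
predictions = [290.0, 70.0]
loss = mse_loss(predictions, targets)
```

In practice the predictions would come from the encoder plus a regression head, and the same loss would be averaged over a mini-batch.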

Mixed Region Technique

  • Tracks from a single region are not sufficient for training, so training with multi-region tracks may be more reasonable.
  • However, preferences and population differ between regions, so region information has to be included.
    1. region cls_token
    2. region embedding
  • If the above method works, the region token or embedding may be analyzed to reveal the different preferences of different regions. Furthermore, it could be used to describe personal preference.
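
The two options listed above could look roughly like this. It is a sketch of one possible reading, not the actual implementation: the sizes, the `region_table` name and the choice to add the region embedding to every patch are assumptions, and NumPy arrays stand in for learnable model parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_REGIONS = 3   # hypothetical number of chart regions
EMBED_DIM = 8     # hypothetical model width
NUM_PATCHES = 5   # hypothetical number of spectrogram patches per clip

# One vector per region; it would be learned in a real model, random here.
region_table = rng.normal(size=(NUM_REGIONS, EMBED_DIM))


def prepend_region_token(patches: np.ndarray, region_id: int) -> np.ndarray:
    """Option 1: use the region vector as a region-specific cls_token."""
    token = region_table[region_id][None, :]         # shape (1, EMBED_DIM)
    return np.concatenate([token, patches], axis=0)  # one extra sequence position


def add_region_embedding(patches: np.ndarray, region_id: int) -> np.ndarray:
    """Option 2: add the region vector to every patch embedding."""
    return patches + region_table[region_id]         # broadcast over patches


# Stand-in for the patch embeddings of one audio clip.
patches = rng.normal(size=(NUM_PATCHES, EMBED_DIM))
with_token = prepend_region_token(patches, region_id=1)
with_embedding = add_region_embedding(patches, region_id=1)
```

Either way, the rows of `region_table` are what would later be inspected to compare regional preferences.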

Misc.

Augmentation

  • use different preview clips of the same track
    • typically, a song has more than one track on Spotify (since it is published on different albums), and these tracks may not share the same preview clip.
  • SpecAugment
  • Rolling
  • Random Gain
  • pitch
  • timbre
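
Three of the augmentations above (rolling, random gain and SpecAugment-style masking) can be sketched on a spectrogram of shape (n_mels, n_frames). This is a rough NumPy illustration under assumptions: the function names, the default mask widths and the ±6 dB gain range are hypothetical, the gain is applied as an offset to a log-scaled spectrogram, and SpecAugment's time warping is omitted.

```python
import numpy as np


def roll_time(spec: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Rolling: circularly shift the spectrogram along the time axis."""
    shift = int(rng.integers(0, spec.shape[1]))
    return np.roll(spec, shift, axis=1)


def random_gain(spec_db: np.ndarray, rng: np.random.Generator,
                max_db: float = 6.0) -> np.ndarray:
    """Random gain: add one random offset (in dB) to a log-scaled spectrogram."""
    return spec_db + rng.uniform(-max_db, max_db)


def mask_bands(spec: np.ndarray, rng: np.random.Generator,
               max_freq: int = 8, max_time: int = 16) -> np.ndarray:
    """SpecAugment-style masking: zero one frequency band and one time band."""
    out = spec.copy()
    n_mels, n_frames = out.shape
    f = int(rng.integers(0, max_freq + 1))       # width of the frequency mask
    f0 = int(rng.integers(0, n_mels - f + 1))    # first masked mel bin
    out[f0:f0 + f, :] = 0.0
    t = int(rng.integers(0, max_time + 1))       # width of the time mask
    t0 = int(rng.integers(0, n_frames - t + 1))  # first masked frame
    out[:, t0:t0 + t] = 0.0
    return out


rng = np.random.default_rng(0)
spec = rng.normal(size=(64, 100))  # stand-in for a normalised log-mel spectrogram
augmented = mask_bands(random_gain(roll_time(spec, rng), rng), rng)
```

Masking with zeros assumes the spectrogram has been normalised to zero mean, so a masked band equals the average value.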

Interesting Observation

Refs


  1. C. V. Soares Araujo, M. A. Pinheiro de Cristo and R. Giusti, "Predicting Music Popularity Using Music Charts," 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), 2019, pp. 859-864, doi: 10.1109/ICMLA.2019.00149.

  2. L. Colley et al., "Elucidation of the Relationship Between a Song's Spotify Descriptive Metrics and its Popularity on Various Platforms," 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 2022, pp. 241-249, doi: 10.1109/COMPSAC54236.2022.00042.

  3. research support not found yet

  4. https://newsroom.tiktok.com/en-us/new-studies-quantify-tiktoks-growing-impact-on-culture-and-music

  5. L.-C. Yu, Y.-H. Yang, Y.-N. Hung and Y.-A. Chen, "Hit Song Prediction for Pop Music by Siamese CNN with Ranking Loss" (1710)

  6. J. Lee and J.-S. Lee, "Music Popularity: Metrics, Characteristics, and Audio-Based Prediction," IEEE Transactions on Multimedia, vol. 20, 2018, pp. 3173-3182.

  7. Y. Gong, Y.-A. Chung and J. Glass, "AST: Audio Spectrogram Transformer," arXiv:2104.01778.
