Predicting Emotional Expression in Music Based on Musical Cues

This cognitive science project tries to predict the emotion associated with a song from the musical components used in it. The objective is to determine which musical cues have the greatest impact on four emotions: sad, peaceful, happy, and scary. In other words, which features of a song contribute the most to the perceived emotion? A synthetic sample of 200 MIDI songs was generated based on 7 features:

  • Register: pitch of the music
  • Tempo: average number of notes per second
  • Soundlevel: Loudness in decibels (dB)
  • Articulation: Duration of note (legato to staccatissimo)
  • Timbre: Instrument brightness
  • Mode: Major or minor key
  • Structure: Samples of prototypical sad, peaceful, happy and scary songs

The songs were presented to 46 subjects, who were asked to rate the perceived level of each of the four emotions on a continuous scale. The following plot shows the highest-rated song for each emotion, plotted against its ratings for the other emotions. It shows that the most prototypical scary song is perceived almost exclusively as scary, while the other prototypical songs are more ambiguous.
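The selection behind such a plot can be sketched as follows. This is a minimal illustration, not the project's actual code: the song names, the rating values and the two-subject lists are all made up, and the helper names `mean_profiles` and `top_song_per_emotion` are hypothetical.

```python
from statistics import mean

EMOTIONS = ("sad", "peaceful", "happy", "scary")

# Hypothetical ratings: ratings[song][emotion] -> one value per subject.
# The numbers are invented for illustration only.
ratings = {
    "song_a": {"sad": [6.0, 7.0], "peaceful": [4.0, 5.0], "happy": [1.0, 2.0], "scary": [1.0, 1.0]},
    "song_b": {"sad": [3.0, 4.0], "peaceful": [6.0, 6.0], "happy": [3.0, 3.0], "scary": [1.0, 1.0]},
    "song_c": {"sad": [1.0, 1.0], "peaceful": [3.0, 4.0], "happy": [7.0, 6.0], "scary": [1.0, 2.0]},
    "song_d": {"sad": [2.0, 2.0], "peaceful": [1.0, 1.0], "happy": [1.0, 1.0], "scary": [7.0, 7.0]},
}


def mean_profiles(ratings):
    """Average the per-subject ratings into one value per song and emotion."""
    return {
        song: {emotion: mean(by_emotion[emotion]) for emotion in EMOTIONS}
        for song, by_emotion in ratings.items()
    }


def top_song_per_emotion(profiles):
    """For each emotion, return the song with the highest mean rating."""
    return {
        emotion: max(profiles, key=lambda song: profiles[song][emotion])
        for emotion in EMOTIONS
    }


profiles = mean_profiles(ratings)
top = top_song_per_emotion(profiles)

# Print each top song together with its full emotion profile, which is
# what the plot compares: the target emotion against the other three.
for emotion, song in top.items():
    print(emotion, song, profiles[song])
```

A song whose profile is high on its target emotion and low on the other three is unambiguous in the sense used above; a flatter profile indicates a more ambiguous song.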

Next, we look at the semi-partial correlations of the different features, which should allow us to investigate which features contribute the most to each emotion. The three features with the highest explained variance for each emotion are reported here:

  • Scary: MelodyS3 (0.42), Register (0.16), Mode (0.08)
  • Happy: Mode (0.48), Tempo (0.12), Register (0.10)
  • Sad: Mode (0.54), Tempo (0.22), Articulation (0.04)
  • Peaceful: Tempo (0.22), MelodyS3 (0.17), Soundlevel (0.15)
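One common way to obtain such values is the squared semi-partial correlation: the drop in R² when a single feature is removed from a regression that contains all features. The sketch below assumes that this is what the reported numbers represent, which the text does not state explicitly. The data is synthetic and the function names are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    ss_res = float(residual @ residual)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def squared_semipartial(X, y):
    """Unique variance explained by each column: R^2(full) - R^2(without it)."""
    full = r_squared(X, y)
    unique = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        reduced = np.delete(X, j, axis=1)  # drop feature j only
        unique[j] = full - r_squared(reduced, y)
    return unique


# Synthetic stand-in data (NOT the study's data): three features with a
# strong, a weak and no effect on the rating.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200)

sr2 = squared_semipartial(X, y)
print(np.round(sr2, 3))
```

Because each value measures only the variance a feature explains beyond all the others, the values need not sum to the full model's R² when features are correlated.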

To explore which features to use, and to what extent, when trying to evoke specific emotions in music, we plot the feature impact against the perceived emotion.

The authors of the original research predicted emotion by regression (Eerola, Tuomas 2013), but we sought to improve the prediction accuracy by using a non-linear function in the form of a simple two-layer neural network. This improved prediction accuracy for all emotions but one: peaceful. Below is a visualisation of the predictions, where all features are projected onto two principal components.
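The comparison between a linear regression and a small neural network can be sketched as below. This is an illustrative stand-in, not the project's implementation: it reads "two-layer" as one tanh hidden layer plus a linear output, which is an assumption, and the data, hidden size, learning rate and epoch count are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's data): 200 songs, 7 cue values,
# and one emotion rating that depends non-linearly on the cues.
X = rng.normal(size=(200, 7))
y = np.tanh(X[:, 0] * X[:, 1]) + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]


def fit_linear(X, y):
    """Ordinary least-squares baseline with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef


def fit_mlp(X, y, hidden=8, lr=0.02, epochs=1500, seed=0):
    """One tanh hidden layer and a linear output, full-batch gradient descent."""
    init = np.random.default_rng(seed)
    W1 = init.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = init.normal(scale=0.5, size=hidden)
    b2 = 0.0
    n = len(X)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden activations, (n, hidden)
        err = H @ W2 + b2 - y               # prediction error, (n,)
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the mean squared error, computed before any update.
        g_out = 2.0 * err / n
        g_hidden = np.outer(g_out, W2) * (1.0 - H ** 2)
        g_W2, g_b2 = H.T @ g_out, g_out.sum()
        g_W1, g_b1 = X.T @ g_hidden, g_hidden.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (lambda Z: np.tanh(Z @ W1 + b1) @ W2 + b2), losses


def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))


linear_predict = fit_linear(X_train, y_train)
mlp_predict, losses = fit_mlp(X_train, y_train)

print("linear test MSE:", mse(linear_predict(X_test), y_test))
print("MLP test MSE:   ", mse(mlp_predict(X_test), y_test))
```

Whether the network beats the linear baseline depends on the data and on the training settings, so the two errors should be compared on held-out songs, and per emotion, rather than assumed.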