The problem

I like listening to music while I’m programming. However, I have lots of music on my computer and I don’t really like all of it (or find it appropriate as background music for when I’m programming).

I guess I could spend a few evenings going through my music collection, deleting anything I don’t like (which isn’t always such an easy decision), creating a playlist or whatever. This isn’t a permanent solution, though: over time my collection will keep growing, I may get sick of hearing some songs, my taste may change, etc. Also, I prefer spending my time programming, reading or doing anything else more interesting than sorting my music collection.

Unrelated to the problem itself, let me also mention that I use different media players. Currently I’m mostly using Banshee, but I’ve been switching back and forth between it and Rhythmbox these last months, and I also used a command-line player for some time. I’m saying this because I expect an optimal solution to be media player-independent. The bigger problem of each media player having its own database is also something I’d like to see addressed, maybe with some technology like Tracker, but that is another topic.

The status quo

My current approach to this problem is basically setting my media player to choose songs randomly, and using the big “Next >>|” button on my keyboard whenever it chooses a song that annoys me.

(Banshee’s playback list interface has an option to automatically fill itself, sorting songs by popularity, but it doesn’t seem to work well at all here. I also recall Amarok having some automatic playlist generation options, but I’m not using any KDE applications anymore, and in any case this is outside the scope of an application-neutral solution.)

The solution

It has recently occurred to me that a neat solution for this would be to gather information from a generic event log and to translate that into a numerical score for each song. I think you may already guess where I’m heading, but if not: Zeitgeist!

With the appropriate data-sources installed, Zeitgeist holds information on which songs started playing automatically (because they are on a playlist or because you have your media player set to random) and which were started manually. In both cases it also knows which ones you listened to completely and which ones you skipped (and how long you put up with them before skipping). It shouldn’t be too difficult to write a script which periodically requests this information from Zeitgeist and gives songs positive points for every time you listened to them completely (extra if you chose them yourself) and negative points for every time you skipped them (fewer negative points if you lasted through half of the song than if you skipped it as soon as you recognized it).
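To make the point rules above a bit more concrete, here is a minimal sketch in Python. It deliberately does not talk to Zeitgeist; it assumes the playback events have already been fetched and turned into simple records. The PlayEvent class, its fields and all the weights are made up for illustration, not part of any existing API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PlayEvent:
    """Hypothetical record of one playback, derived from the event log."""
    uri: str                 # identifies the song
    manual: bool             # True if the user picked the song themselves
    played_seconds: float    # how long the song played before ending or being skipped
    duration_seconds: float  # total length of the song


# Illustrative weights (assumptions, to be tuned).
FULL_LISTEN_POINTS = 1.0
MANUAL_BONUS = 0.5
MAX_SKIP_PENALTY = 1.0
COMPLETE_THRESHOLD = 0.95  # fraction played that counts as "listened completely"


def event_points(event: PlayEvent) -> float:
    """Points for a single playback: positive if finished, negative if skipped."""
    if event.duration_seconds <= 0:
        return 0.0  # no usable length information, so stay neutral
    fraction = min(event.played_seconds / event.duration_seconds, 1.0)
    if fraction >= COMPLETE_THRESHOLD:
        return FULL_LISTEN_POINTS + (MANUAL_BONUS if event.manual else 0.0)
    # Skipped: the earlier the skip, the larger the penalty.
    return -MAX_SKIP_PENALTY * (1.0 - fraction)


def score_songs(events) -> dict:
    """Sum the points of all playbacks per song."""
    scores = defaultdict(float)
    for event in events:
        scores[event.uri] += event_points(event)
    return dict(scores)
```

With these weights a song skipped halfway costs 0.5 points, while one skipped right at the start costs a full point. A linear penalty is just the simplest choice; any function that shrinks as the listened fraction grows would fit the idea.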

With this score information, music players can avoid playing songs you don’t like and give you only those you like, plus new ones for which there isn’t any information yet. (If the player followed the scores strictly, it would end up playing the same songs all the time and ignoring new songs with a score around 0.) The importance of the play/skip actions would decay over time (whether you listened to a song yesterday matters more than whether you did six months ago), and so on.
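Both ideas from this paragraph, decay over time and not following the numbers too strictly, can be sketched in a few lines. This is only one possible approach: the 30-day half-life, the clamping range and the exponential weighting are assumptions of mine, and all function names are hypothetical.

```python
import math
import random

HALF_LIFE_DAYS = 30.0  # assumed: a play or skip counts half as much after 30 days
SCORE_CLAMP = 5.0      # assumed: keeps the selection weights in a sane range


def decayed_weight(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Weight of an action that happened age_days ago (1.0 for right now)."""
    return 0.5 ** (age_days / half_life_days)


def decayed_score(contributions, now_days: float) -> float:
    """Sum of (points, timestamp_in_days) pairs, with older actions counting less."""
    return sum(points * decayed_weight(now_days - when) for points, when in contributions)


def pick_song(scores: dict, rng=random, temperature: float = 1.0) -> str:
    """Semi-random choice: higher-rated songs are more likely, none is impossible.

    A song without any history has a rating of 0 and therefore weight 1, so it
    still gets played now and then instead of being ignored forever.
    """
    uris = list(scores)
    weights = []
    for uri in uris:
        clamped = max(-SCORE_CLAMP, min(SCORE_CLAMP, scores[uri]))
        weights.append(math.exp(clamped / temperature))
    return rng.choices(uris, weights=weights, k=1)[0]
```

A higher temperature flattens the weights and makes the selection closer to plain shuffle; a lower one makes the player stick more closely to your favourites.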

If we wanted to create something really fancy, we could even look at generating different ratings for separate circumstances, e.g. in case you like listening to a different sort of music in the morning than in the evening, or to differentiate between what you like to hear while you are coding and while the computer is idle (maybe because you are doing paper homework and only using the computer for background music). The information for all this is there in Zeitgeist, so it’s only a matter of writing a good algorithm.
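One simple way to get separate ratings per circumstance would be to keep one table of ratings per "context" instead of a single global one. The sketch below assumes each rated playback comes with a timestamp and a flag saying whether you were actively using the computer; how that flag would be derived from the event log is left open, and the bucket boundaries are arbitrary.

```python
from collections import defaultdict
from datetime import datetime


def context_of(timestamp: datetime, user_active: bool) -> tuple:
    """Map a playback to a coarse context such as ("morning", "active")."""
    if 5 <= timestamp.hour < 12:
        part_of_day = "morning"
    elif 12 <= timestamp.hour < 18:
        part_of_day = "afternoon"
    else:
        part_of_day = "evening"  # also covers the night hours
    return (part_of_day, "active" if user_active else "idle")


def score_by_context(samples) -> dict:
    """Aggregate (uri, points, timestamp, user_active) tuples per context.

    Returns {context: {uri: total_points}}.
    """
    table = defaultdict(lambda: defaultdict(float))
    for uri, points, timestamp, user_active in samples:
        table[context_of(timestamp, user_active)][uri] += points
    return {context: dict(per_song) for context, per_song in table.items()}
```

A player would then look up the table for the current context and fall back to the global ratings when a context has too little data.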

The implementation

I’ve already explained most of how this should work in the previous section, but here’s a bit of an overview of what’s needed for this:

  • Data-sources inserting the music playback information. We already have a data-source for Rhythmbox implemented as an extension, but data-sources for Banshee and other players are still missing.
  • The actual algorithm, probably implemented as a periodically run script leaving the aggregated information at some accessible place, although this may vary depending on the degree of fanciness you choose.
  • The interface, i.e. plugins for Rhythmbox and other media players which take that information and use it to provide an option for semi-randomly choosing music, excluding stuff you don’t like.
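As for the "accessible place" mentioned in the list above, the simplest thing I can think of is a JSON file which the script rewrites on every run and which any player plugin can read. Here is a rough sketch; the file location and format are assumptions for illustration, not an agreed-upon convention.

```python
import json
import os
import tempfile
from pathlib import Path


def default_scores_path() -> Path:
    """Hypothetical location for the shared file, below the XDG data directory."""
    base = os.environ.get("XDG_DATA_HOME") or os.path.join(
        os.path.expanduser("~"), ".local", "share"
    )
    return Path(base) / "song-scores" / "scores.json"


def save_scores(scores: dict, path) -> None:
    """Write the ratings atomically, so readers never see a half-written file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(scores, handle, indent=2, sort_keys=True)
        os.replace(tmp_name, path)  # atomic rename over the old file
    except BaseException:
        os.unlink(tmp_name)
        raise


def load_scores(path) -> dict:
    """Read the ratings; a missing file simply means there is no data yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}
```

The periodically run script would call save_scores() at the end of each run, and a plugin would call load_scores() when it needs to pick the next song. Something like D-Bus would be fancier, but a plain file keeps the players completely independent of the script.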

If I’ve got you interested in this, I’m willing to mentor someone, so get in touch! Feel free to jump into #zeitgeist on irc.freenode.net or drop me a mail.