Last.fm is a website that tracks your music listening history across devices (computer, phone, iPod, etc.) and services (Spotify, iTunes, Google Play, etc.). I’ve been using Last.fm for nearly 10 years now, and my tracked listening history goes back even further when you consider all the pre-existing iTunes play counts that I scrobbled (i.e., submitted to my Last.fm database) when I joined Last.fm.
Using Python, pandas, matplotlib, and Leaflet, I downloaded my listening history from Last.fm’s API, analyzed and visualized the data, downloaded full artist details from the MusicBrainz API, then geocoded and mapped all the artists I’ve played. All of the code I used to do this is available in this GitHub repo, and is easy to re-purpose for exploring your own Last.fm history. All you need is an API key.
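As a rough sketch of the download step, assuming Last.fm's `user.getRecentTracks` method and its JSON response layout: the helper names `recent_tracks_url` and `parse_page`, the API root shown, and the sample payload are all illustrative assumptions, and this is not necessarily how the notebook in the repo does it.

```python
import json
from urllib.parse import urlencode

# Assumed root URL for Last.fm web service calls.
API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def recent_tracks_url(username, api_key, page=1, limit=200):
    """Build the URL for one page of a user's scrobbles."""
    params = {
        "method": "user.getrecenttracks",
        "user": username,
        "api_key": api_key,
        "format": "json",
        "limit": limit,
        "page": page,
    }
    return API_ROOT + "?" + urlencode(params)


def parse_page(payload):
    """Flatten one decoded JSON page into a list of scrobble dicts."""
    rows = []
    for track in payload["recenttracks"]["track"]:
        date = track.get("date")
        if date is None:
            # A track that is playing right now carries no date yet.
            continue
        rows.append({
            "artist": track["artist"]["#text"],
            "artist_mbid": track["artist"].get("mbid", ""),
            "album": track["album"]["#text"],
            "track": track["name"],
            "timestamp": int(date["uts"]),  # seconds since the epoch, UTC
        })
    return rows


# Hypothetical, hand-written payload in the assumed response shape.
sample_payload = json.loads("""
{"recenttracks": {"track": [
  {"artist": {"#text": "David Bowie", "mbid": ""},
   "album": {"#text": "Example Album", "mbid": ""},
   "name": "Example Track",
   "date": {"uts": "1425645000"}},
  {"artist": {"#text": "The Kinks", "mbid": ""},
   "album": {"#text": "Another Album", "mbid": ""},
   "name": "Now Playing Track"}
]}}
""")

scrobbles = parse_page(sample_payload)
```

In a real run you would request `recent_tracks_url(...)` for each page in turn, decode the JSON, and collect the rows from `parse_page` into a single table such as a pandas DataFrame.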
First I visualized my most-played artists, above. Across the dataset, I have 279,769 scrobbles (aka, song plays). I’ve listened to 26,761 different artists and 66,377 different songs across 38,026 different albums from when I first started using iTunes circa 2005 through the present day. This includes pretty close to every song I’ve played on anything other than vinyl during that time.
I also mapped all the artists I’ve listened to. To do this, I took each artist ID in the Last.fm data set and passed it to the MusicBrainz API to get full artist details. Then I recursively queried the place until I got a full place name, like “Brixton, London, England, UK” (this process takes a while, and is perfectly suited to run on a Raspberry Pi!). Next I geocoded these place names to latitude-longitude using the Nominatim and Google APIs. Finally I mapped these points in Python with matplotlib basemap:
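The recursive place lookup described above can be sketched roughly as follows. This is only an illustration: the `AREAS` dictionary is a hypothetical stand-in for the API (each lookup would really be a separate, rate-limited MusicBrainz request), and `full_place_name` is a made-up helper name.

```python
# Hypothetical stand-in for area lookups: id -> (name, parent id or None).
AREAS = {
    "brixton": ("Brixton", "london"),
    "london": ("London", "england"),
    "england": ("England", "uk"),
    "uk": ("UK", None),
}


def full_place_name(area_id, lookup, cache=None):
    """Walk up the parent chain and join the names, most specific first."""
    if cache is None:
        cache = {}
    if area_id in cache:
        # Already resolved: no need to query this area (or its parents) again.
        return cache[area_id]
    name, parent_id = lookup(area_id)
    if parent_id is None:
        full = name
    else:
        full = name + ", " + full_place_name(parent_id, lookup, cache)
    cache[area_id] = full
    return full


shared_cache = {}
place = full_place_name("brixton", AREAS.__getitem__, shared_cache)
```

Passing the same `cache` dictionary across artists means that shared parents such as a city or country are resolved only once, which matters when every lookup is a slow network call.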
I also converted these points to GeoJSON to produce an interactive Leaflet web map of the artists I listen to (see this previous post for more on exporting pandas DataFrames to GeoJSON). Click any point in the map below to see a list of artists from there:
I predominantly listen to artists from the populated areas of the U.S. and Europe’s blue banana. But this map is not fully representative of all the artists I’ve played, because many of them lack a MusicBrainz ID on Last.fm, and many who do have an ID lack place information in the MusicBrainz database. This database also over-represents Western artists and artists listened to by Westerners, and under-represents other artists around the world who appear in my listening history.
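For the GeoJSON step, a minimal sketch might group artists by geocoded point so that each map marker carries a list of artists. The function name `artists_to_geojson`, the row layout, the placeholder artist names, and the approximate coordinates are all assumptions for illustration; with pandas, rows like these could come from `df.to_dict('records')`.

```python
import json
from collections import defaultdict


def artists_to_geojson(rows):
    """Build a GeoJSON FeatureCollection with one point per distinct place."""
    by_point = defaultdict(list)
    for row in rows:
        by_point[(row["lat"], row["lng"], row["place"])].append(row["artist"])

    features = []
    for (lat, lng, place), artists in by_point.items():
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [lng, lat]},
            "properties": {"place": place, "artists": sorted(artists)},
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder rows with rough, illustrative coordinates.
rows = [
    {"artist": "Artist B", "place": "Brixton, London, England, UK",
     "lat": 51.46, "lng": -0.11},
    {"artist": "Artist A", "place": "Brixton, London, England, UK",
     "lat": 51.46, "lng": -0.11},
    {"artist": "Artist C", "place": "Berkeley, California, US",
     "lat": 37.87, "lng": -122.27},
]

collection = artists_to_geojson(rows)
geojson_text = json.dumps(collection)
```

The resulting text can be saved to a file and loaded by a web map; the `artists` property is what a click handler would display for each point.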
Last.fm trends over time
I was curious about my most-played artists’ relative performance over time. So, I took the top six artists and charted their cumulative play counts since 2009:
David Bowie is the big winner here, moving from sixth place in 2009 all the way up to first place today as my most-played artist (since Last.fm sign-up). Note that these disaggregate data differ slightly from the aggregate play counts per artist earlier, because Last.fm here discards the lump-sum play counts from iTunes that I first scrobbled upon signing up. The disaggregate scrobbles with dates thus only represent post-sign-up song plays.
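A cumulative play count per artist is just a running total of dated scrobbles. Here is a small standard-library sketch of the idea, using made-up data and a hypothetical `cumulative_plays` helper; in pandas the same thing is typically a group-by followed by `cumsum`.

```python
from collections import Counter
from itertools import accumulate


def cumulative_plays(scrobbles, years):
    """Return {artist: [running play total at the end of each year]}."""
    # scrobbles is a list of (artist, year) tuples, one per song play.
    per_artist_year = Counter(scrobbles)
    artists = sorted({artist for artist, _ in scrobbles})
    return {
        artist: list(accumulate(per_artist_year[(artist, y)] for y in years))
        for artist in artists
    }


# Made-up plays: Artist B starts behind Artist A but overtakes it.
plays = [
    ("Artist A", 2009), ("Artist A", 2009),
    ("Artist B", 2009), ("Artist B", 2010),
    ("Artist B", 2010), ("Artist B", 2011),
]

totals = cumulative_plays(plays, [2009, 2010, 2011])
```

Because only dated scrobbles enter the running total, any lump-sum counts without dates are left out, which is why such totals can differ from the aggregate play counts.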
I wanted to look more into these time dynamics of my listening history. How have they changed over the years? And, when exactly do I spend time listening to music? First I looked at my songs played per month since January 2010:
Although there are a couple of big spikes, during most months I listen to somewhere around 1,000 to 2,500 songs. The peak during March 2015 coincides with my doctoral qualifying exams, which saw me sitting in my room about 16 hours a day reading, writing… and listening to music. Next I looked at which days of the week I do most of my listening:
So, I listen to the most music on Fridays, and the least on Saturdays. The weekdays are all consistently higher as I tend to listen to music all day long while I’m working. The weekends are consistently lower as I tend to be out and about more, away from my computer and stereo. Next I looked at my cumulative listening history by hour of the day:
This chart essentially follows my sleep, wake, and work schedule. Most of my listening occurs during the mid-day while I’m working and tails off into the evening. But this aggregate pattern isn’t exactly the same each day. Here I broke out the hourly chart above by each day of the week:
Now it’s easy to see the low days of Saturday and Sunday – but interestingly, Saturday has my highest play count late at night, when I’m up late PARTYING. Fridays at noon and Wednesdays at 3pm have the highest peaks. Also, my Monday mornings get off to slower starts than the rest of the work days, as I recover from the weekend.
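All of the time breakdowns above (per month, per weekday, per hour, and per hour within each weekday) come down to bucketing scrobble timestamps. The sketch below shows one way to do that with the standard library; `time_buckets` is a hypothetical helper, the sample timestamps are invented, and it assumes Unix timestamps interpreted in a single time zone (UTC here, though local time is what matters for a listening schedule).

```python
from collections import Counter
from datetime import datetime, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def time_buckets(timestamps, tz=timezone.utc):
    """Count plays by month, weekday, hour, and (weekday, hour)."""
    by_month = Counter()
    by_weekday = Counter()
    by_hour = Counter()
    by_weekday_hour = Counter()
    for ts in timestamps:
        dt = datetime.fromtimestamp(ts, tz=tz)
        day = DAY_NAMES[dt.weekday()]  # weekday() is 0 for Monday
        by_month[dt.strftime("%Y-%m")] += 1
        by_weekday[day] += 1
        by_hour[dt.hour] += 1
        by_weekday_hour[(day, dt.hour)] += 1
    return by_month, by_weekday, by_hour, by_weekday_hour


# Invented plays during the first week of March 2015.
sample_times = [
    datetime(2015, 3, 6, 12, 30, tzinfo=timezone.utc).timestamp(),  # Friday
    datetime(2015, 3, 6, 12, 45, tzinfo=timezone.utc).timestamp(),  # Friday
    datetime(2015, 3, 4, 15, 10, tzinfo=timezone.utc).timestamp(),  # Wednesday
    datetime(2015, 3, 7, 23, 50, tzinfo=timezone.utc).timestamp(),  # Saturday
]

by_month, by_weekday, by_hour, by_weekday_hour = time_buckets(sample_times)
```

Each counter maps directly onto one of the charts: `by_month` for the monthly series, `by_weekday` and `by_hour` for the bar charts, and `by_weekday_hour` for the hourly chart broken out by day.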
Artist names
For yuks, I looked at a couple of traits of artist names. The first is the frequency of artist names beginning with each letter of the alphabet (sans a preceding “the”):
S’s and M’s lead the pack, and Q’s and X’s bring up the rear. Next I looked at the frequency of artist name lengths:
Not everyone can be a name length outlier like X and Orchestral Manoeuvres in the Dark.
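Both name traits are simple counts. A sketch under a couple of assumptions: the helper names are hypothetical, a leading “the” is dropped only for the first-letter count, and name length is measured on the full name as written (the post does not say which convention the charts use).

```python
from collections import Counter


def strip_leading_the(name):
    """Drop a leading 'the ' (any capitalization) from an artist name."""
    name = name.strip()
    if name.lower().startswith("the "):
        return name[4:]
    return name


def first_letter_counts(names):
    """Count artists by the first character of the name, ignoring 'the'."""
    stripped = (strip_leading_the(n) for n in names)
    return Counter(s[0].upper() for s in stripped if s)


def name_length_counts(names):
    """Count artists by the number of characters in the full name."""
    return Counter(len(n) for n in names)


artists = ["David Bowie", "The Kinks", "X",
           "Orchestral Manoeuvres in the Dark"]

letters = first_letter_counts(artists)
lengths = name_length_counts(artists)
```

With these four names, “The Kinks” is counted under K rather than T, and the two length outliers land at 1 and 33 characters.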
Top songs and albums on Last.fm
Finally, I’ll wrap this up similarly to how I started it by visualizing my most-played songs and albums of all time on Last.fm. First, my most-played tracks:
And lastly, my most-played albums:
There are some common themes here: similar artists appear in both the most-played songs and most-played albums lists, unsurprisingly. There’s also a pretty clear correlation between the most-played albums and the number of tracks on the album, as these data are not normalized by the latter. It might be possible to normalize album play counts by number of tracks on the album, by querying the MusicBrainz API for more information.
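The normalization suggested above would amount to dividing each album's play count by its number of tracks. A minimal sketch with invented numbers: `plays_per_track` is a hypothetical helper, and the track counts are assumed to have been fetched separately (for example from MusicBrainz).

```python
def plays_per_track(album_plays, track_counts):
    """Return average plays per track for albums with a known track count."""
    normalized = {}
    for album, plays in album_plays.items():
        n_tracks = track_counts.get(album)
        if not n_tracks:
            # Skip albums whose track count is missing or zero.
            continue
        normalized[album] = plays / n_tracks
    return normalized


# Invented example: the longer album has fewer plays per track.
album_plays = {"Album A": 300, "Album B": 240, "Album C": 50}
track_counts = {"Album A": 10, "Album B": 24}

per_track = plays_per_track(album_plays, track_counts)
ranked = sorted(per_track, key=per_track.get, reverse=True)
```

Ranking by this figure favors albums that get replayed in full over albums that simply have many tracks.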
To recap, I downloaded my listening history from Last.fm, analyzed and visualized the data, downloaded full artist details from the MusicBrainz API, then geocoded and mapped all the artists I’ve played, and finally dumped these points to GeoJSON for Leaflet web mapping. All of the Python and Leaflet code I used to do this is available in this GitHub repo, and is easy to re-purpose for exploring your own Last.fm history.
You might also be interested in:
- Running scientific Python on a Raspberry Pi
- Exporting pandas DataFrames to GeoJSON
- Mapping everywhere I’ve ever been in my life
- Our course at UC Berkeley that teaches these skills and tools
8 replies on “Analyzing Last.fm Listening History”
Not a fan of Queen, eh? :)
I’ve been meaning to do something similar with my own scrobbles, though my listening history is shorter. (And I also listen to the radio a lot, especially on weekends.) Great work!
Would love to see some data on the number of songs per artist. Do you listen to a few Kinks songs on repeat, but the entirety of David Bowie’s work? Are there a lot of 1 hit wonders you listen to?
Nice post – thanks for sharing!
What I didn’t get is how to use the lastfm username and API key:
from keys import lastfm_api_key as key, lastfm_user_name as username
ImportError: No module named keys
Thanks
Ah yeah, I just use that convention to conceal my personal keys from the world. The simple fix is to delete that one import line and replace it with two new lines: 1) key = 'YOUR-LASTFM-API-KEY' …and 2) username = 'YOUR-LASTFM-USERNAME'. Obviously replace those placeholder values with your actual values, keeping the quotes so that Python treats them as strings. I updated the IPython notebook on GitHub with this info.
This page explains how to convert from notebook to a .py script:
https://ipython.org/ipython-doc/1/interactive/nbconvert.html
Oh, this is going to be fun. Thanks for sharing!
[…] Thanks to Brian whose blog post inspired me to do this, and thanks to Geoff Boeing, whose work finding locations of artists really helped, if he ever comes across this […]
[…] global analysis. Maybe at some point I will try to re-create what others have done with profiles ([1], [2], [3]) or take a closer look at existing third-party services ([4], […]
Awesome post – can’t wait to play with this! Thanks for sharing!