Categories
Data

Mapping Your Google Location History with Python

Small map of my Google location history data in the San Francisco Bay Area, 2012-2016

I recently wrote about visualizing my Foursquare check-in history and it inspired me to map my entire Google location history data – about 1.2 million GPS coordinates from my Android phone between 2012 and 2016. I used Python and its pandas, matplotlib, and basemap libraries. The Python code is available in this notebook in this GitHub repo, and it’s simple to re-use to visualize your own location history.

Just download your JSON file from Google then run the code. First I load the JSON file and parse the latitude, longitude, and timestamp with pandas. Then I map my worldwide data set:
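As a rough illustration of the parsing step, here is a minimal sketch in plain Python. It assumes the export has a top-level `locations` list whose entries carry `timestampMs`, `latitudeE7`, and `longitudeE7` fields (coordinates stored as integers scaled by 10^7). That layout, the `parse_locations` helper, and the sample record are all assumptions for illustration, not necessarily what the notebook does.

```python
import json
from datetime import datetime, timezone


def parse_locations(raw_json):
    """Return a list of {'lat', 'lon', 'timestamp'} dicts from an export string.

    Assumes a top-level "locations" list with "timestampMs", "latitudeE7",
    and "longitudeE7" fields (hypothetical layout, see the note above).
    """
    data = json.loads(raw_json)
    rows = []
    for point in data["locations"]:
        rows.append({
            # E7 values are degrees multiplied by 10**7
            "lat": point["latitudeE7"] / 1e7,
            "lon": point["longitudeE7"] / 1e7,
            # timestamps are milliseconds since the Unix epoch, stored as strings
            "timestamp": datetime.fromtimestamp(
                int(point["timestampMs"]) / 1000, tz=timezone.utc
            ),
        })
    return rows


# Made-up sample record standing in for the downloaded file
sample = (
    '{"locations": [{"timestampMs": "1464739200000", '
    '"latitudeE7": 377749000, "longitudeE7": -1224194000}]}'
)
rows = parse_locations(sample)
print(rows[0])
```

A list of dicts like this can be handed straight to `pandas.DataFrame(rows)` for the analysis and mapping steps.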

Map of my Google location history data worldwide, 2012-2016

Categories
Data

Analyzing Last.fm Listening History

Last.fm is a web site that tracks your music listening history across devices (computer, phone, iPod, etc) and services (Spotify, iTunes, Google Play, etc). I’ve been using Last.fm for nearly 10 years now, and my tracked listening history goes back even further when you consider all my pre-existing iTunes play counts that I scrobbled (ie, submitted to my Last.fm database) when I joined Last.fm.

Using Python, pandas, matplotlib, and Leaflet, I downloaded my listening history from Last.fm’s API, analyzed and visualized the data, downloaded full artist details from the MusicBrainz API, then geocoded and mapped all the artists I’ve played. All of my code used to do this is available in this GitHub repo, and is easy to re-purpose for exploring your own Last.fm history. All you need is an API key.

Last.fm artists played the most

First I visualized my most-played artists, above. Across the dataset, I have 279,769 scrobbles (aka, song plays). I’ve listened to 26,761 different artists and 66,377 different songs across 38,026 different albums from when I first started using iTunes circa 2005 through the present day. This includes pretty close to every song I’ve played on anything other than vinyl during that time.
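The most-played-artists tally boils down to counting scrobbles per artist. Here is a minimal sketch of that idea, assuming each scrobble has already been downloaded into a dict with an `artist` key. The record layout and the placeholder names are hypothetical, not the repo's actual data structure.

```python
from collections import Counter

# Placeholder scrobbles; real records would come from the downloaded history
scrobbles = [
    {"artist": "Artist A", "track": "Song 1"},
    {"artist": "Artist B", "track": "Song 2"},
    {"artist": "Artist A", "track": "Song 3"},
    {"artist": "Artist C", "track": "Song 4"},
    {"artist": "Artist A", "track": "Song 1"},
    {"artist": "Artist B", "track": "Song 5"},
]

# Count one play per scrobble, keyed by artist name
plays_per_artist = Counter(s["artist"] for s in scrobbles)

# The n most-played artists, as (artist, play_count) pairs
top_artists = plays_per_artist.most_common(2)
print(top_artists)
```

The same counts, sorted, are what a bar chart of top artists would plot.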

Categories
Data

Visualize Foursquare Location History

I started using Foursquare at the end of 2012 and kept with it even after it became the pointless muck that is Swarm. Since I’ve now got 4 years of location history (ie, check-ins) data, I decided to visualize and map it with Python, matplotlib, and basemap. The code is available in this GitHub repo. It’s easy to re-purpose to visualize your own check-in history: you just need to plug in your Foursquare OAuth token then run the notebook.

First the notebook downloads all my check-ins from the Foursquare API. Then I map all of them using matplotlib basemap.

Map of Foursquare Swarm check-in location history

Categories
Academia

Urban Informatics and Visualization at UC Berkeley

The fall semester begins next week at UC Berkeley. For the third year in a row, Paul Waddell and I will be teaching CP255: Urban Informatics and Visualization, and this is my first year as co-lead instructor.

Paris open data

This master's-level course trains students to analyze urban data, develop indicators, conduct spatial analyses, create data visualizations, and build interactive web maps. To do this, we use the Python programming language, open source analysis and visualization tools, and public data.

This course is designed to provide future city planners with a toolkit of technical skills for quantitative problem solving. We don’t require any prior programming experience – we teach this from the ground up – but we do expect prior knowledge of basic statistics and GIS.

Update, September 2017: I am no longer a Berkeley GSI, but Paul’s class is ongoing. Check out his fantastic teaching materials in his GitHub repo. From my experiences here, I have developed a course series on urban data science with Python and Jupyter, available in this GitHub repo.

Categories
Data

Map Projections That Lie

How big is Greenland? It’s huge, right? At 836,109 square miles in size, Greenland is the largest island and the 12th largest country on Earth. With only 56,000 people living in that enormous area (80% of which is covered by the world’s only extant ice sheet outside of Antarctica), it is also the least densely populated country on Earth.

You can get a sense of how large Greenland is when you look at a map of the world:

world map mercator projection

It’s huge! Greenland is bigger than the entire continent of Africa! Or is it? The map above uses the common Mercator projection to project the 3-D surface of the Earth onto a 2-D surface suitable for a paper map or an image on your computer screen. But it’s not easy to project the curved surface of a sphere onto a rectangular plane. Compromises must be made. In the case of the Mercator projection, the compromise is that objects’ sizes become increasingly distorted the further they are from the equator. At the poles, the scale and distortion become infinite.
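To put numbers on that distortion, here is a small sketch using the standard spherical Mercator formulas. The projected northing is y = R ln(tan(π/4 + φ/2)), and the local linear scale factor is 1/cos(φ), so areas are inflated by 1/cos²(φ). The function names are my own, and the sphere is an approximation of the Earth's shape.

```python
import math


def mercator_y(lat_deg, radius=1.0):
    """Northing of a latitude on a spherical Mercator map."""
    phi = math.radians(lat_deg)
    return radius * math.log(math.tan(math.pi / 4 + phi / 2))


def linear_scale(lat_deg):
    """Factor by which Mercator stretches lengths at this latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


def area_inflation(lat_deg):
    """Factor by which Mercator inflates areas at this latitude."""
    return linear_scale(lat_deg) ** 2


for lat in (0, 30, 60, 75, 85):
    print(f"{lat:>2} deg: y={mercator_y(lat):.3f}, "
          f"length x{linear_scale(lat):.2f}, area x{area_inflation(lat):.2f}")
```

At the equator both factors are 1. At 60° latitude lengths are doubled and areas quadrupled, and both factors grow without bound as the latitude approaches 90°.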

Categories
Data

Visualizing Summer Travels Part 6: Projecting Spatial Data with Python

This post is part of a series on visualizing data from my summer travels.

I’ve previously discussed visualizing the GPS location data from my summer travels with CartoDB, Leaflet, and Mapbox + Tilemill. I also visualized different aspects of this data set in Python, using the matplotlib plotting library. However, these spatial scatter plots used unprojected lat-long data which looked pretty distorted at European latitudes.

Today I will show how to convert this data into a projected coordinate reference system and plot it again using matplotlib. These projected maps will provide a much more accurate spatial representation of my spatial data and the geographic region. All of my code is available in this GitHub repo, particularly this notebook.
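To see why raw lat-long scatter plots look stretched at European latitudes, consider how much ground one degree covers in each direction. The sketch below uses a spherical Earth and a simple equirectangular scaling. It is only an illustration of the problem, not the projected coordinate reference system used in the notebook, and the function names and radius constant are my own assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def km_per_degree(lat_deg):
    """Return (km per degree of longitude, km per degree of latitude)."""
    lat_km = math.pi * EARTH_RADIUS_KM / 180
    # Meridians converge toward the poles, so a degree of longitude shrinks
    lon_km = lat_km * math.cos(math.radians(lat_deg))
    return lon_km, lat_km


def equirectangular_xy(lat_deg, lon_deg, ref_lat_deg):
    """Scale lat-long to approximate x/y kilometers around a reference latitude."""
    lon_km, lat_km = km_per_degree(ref_lat_deg)
    return lon_deg * lon_km, lat_deg * lat_km


for lat in (0, 40, 50, 60):
    lon_km, lat_km = km_per_degree(lat)
    print(f"{lat:>2} deg: 1 deg lon = {lon_km:.1f} km, 1 deg lat = {lat_km:.1f} km")
```

Plotting degrees directly treats both axes as equal, which stretches features east-west by a factor of 1/cos(latitude). A proper projection removes that stretch.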

Categories
Data

Using geopandas on Windows

This guide was written in 2014 and updated slightly in November 2020.

I recently went through the exercise of installing geopandas on Windows. Having learned several valuable lessons, I thought I’d share them with the world in case anyone else is trying to get this toolkit working in a Windows environment. It seems that pip installing geopandas usually works fine on Linux and Mac. However, several of its dependencies have C extensions that can cause compilation failures with pip on Windows. This guide gets around that issue.

Categories
Data

Visualizing Summer Travels

This is a series of posts about visualizing spatial data. I spent a couple of months traveling in Europe this summer and collected GPS location data throughout the trip with the OpenPaths app. I explored different web mapping technologies such as CartoDB, Leaflet, Mapbox, and Tilemill to plot my travels. I also used Python and matplotlib to run some descriptive statistics and visualize other aspects of my trip.

Here is the series of posts:

My Python code is available in this GitHub repo. I also did some more involved work under the hood to prep the data and support these visualizations. For example, in the following posts I reverse-geocoded the spatial data set and reduced its size with clustering algorithms and the Douglas-Peucker algorithm: