Categories
Data

Clustering to Reduce Spatial Data Set Size

Read/cite the paper here.

In this tutorial, I demonstrate how to reduce the size of a spatial data set of GPS latitude-longitude coordinates using Python and scikit-learn's implementation of the DBSCAN clustering algorithm. All my code is in this IPython notebook in this GitHub repo, where you can also find the data.

Traditionally, the problem was that researchers did not have enough spatial data to answer useful questions or build compelling visualizations. Today, however, the problem is often that we have too much data. Too many scattered points on a map can overwhelm a viewer looking for a simple narrative. Furthermore, rendering a JavaScript web map (with a library like Leaflet) with millions of data points on a mobile device can swamp the processor and leave the map unresponsive.
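As a rough illustration of the idea, here is a minimal sketch of DBSCAN-based point reduction. It is not the notebook's code: the function name `reduce_points`, the 1.5 km `eps_km` default, the choice of `min_samples=1`, and the sample coordinates are all assumptions made for this example. It relies on scikit-learn's `DBSCAN` with the haversine metric, which expects coordinates and the neighborhood radius in radians, and it keeps the one member of each cluster that lies closest to the cluster's centroid.

```python
import numpy as np
from sklearn.cluster import DBSCAN

KMS_PER_RADIAN = 6371.0088  # mean Earth radius in kilometers


def reduce_points(coords_deg, eps_km=1.5, min_samples=1):
    """Cluster (lat, lon) points given in degrees; return one point per cluster.

    Hypothetical helper for illustration. eps_km is the maximum distance
    between two points for them to count as neighbors.
    """
    coords_deg = np.asarray(coords_deg, dtype=float)

    # The haversine metric works on [lat, lon] in radians, so both the
    # coordinates and the eps radius are converted before fitting.
    db = DBSCAN(
        eps=eps_km / KMS_PER_RADIAN,
        min_samples=min_samples,
        algorithm="ball_tree",
        metric="haversine",
    )
    labels = db.fit_predict(np.radians(coords_deg))

    representatives = []
    for label in sorted(set(labels)):
        if label == -1:
            continue  # noise label; only occurs when min_samples > 1
        members = coords_deg[labels == label]
        centroid = members.mean(axis=0)
        # Keep the real data point nearest the centroid. Squared distance
        # in degrees is a simplification that is adequate for small clusters.
        nearest = members[np.argmin(((members - centroid) ** 2).sum(axis=1))]
        representatives.append(tuple(float(v) for v in nearest))
    return representatives


# Made-up sample: three points in Barcelona, two in Paris.
points = [
    (41.3851, 2.1734),
    (41.3860, 2.1740),
    (41.3845, 2.1729),
    (48.8566, 2.3522),
    (48.8570, 2.3530),
]
reduced = reduce_points(points)
print(f"{len(points)} points reduced to {len(reduced)}")
```

With `min_samples=1`, every point ends up in some cluster, so no observation is discarded as noise; a larger value would instead drop isolated points. Which behavior is appropriate depends on the data set.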


Reverse Geocode a Set of Lat-Long Coordinates to City + Country

This tutorial demonstrates how to reverse geocode a set of latitude-longitude coordinates to city and country using Python and the Google Maps API.

I have previously written about my GPS location data from this summer’s travels. The data set, gathered with the OpenPaths app, contains only lat-long coordinates and timestamps, so any visualization built from it alone would be very simplistic. Reverse geocoding these coordinates would add city and country data to each point. Then, I could create richer, more informative marker popups that include this new geographical information.
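To make the reverse geocoding step concrete, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not the code from this tutorial: the helper names (`build_url`, `parse_city_country`, `reverse_geocode`) are hypothetical, the `sample` response is hand-written and trimmed rather than real API output, and a valid Google Maps API key is assumed for any live request. It queries the Geocoding API's JSON endpoint with a `latlng` parameter and reads the `locality` and `country` entries from `address_components`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_url(lat, lng, api_key):
    """Build a reverse geocoding request URL for one coordinate pair."""
    return GEOCODE_URL + "?" + urlencode({"latlng": f"{lat},{lng}", "key": api_key})


def parse_city_country(response):
    """Return (city, country) from a decoded Geocoding API response.

    Either value is None if no matching address component is found.
    """
    city = country = None
    for result in response.get("results", []):
        for component in result.get("address_components", []):
            types = component.get("types", [])
            if city is None and "locality" in types:
                city = component["long_name"]
            if country is None and "country" in types:
                country = component["long_name"]
        if city and country:
            break
    return city, country


def reverse_geocode(lat, lng, api_key):
    """Call the API for one point (needs network access and a valid key)."""
    with urlopen(build_url(lat, lng, api_key), timeout=10) as resp:
        return parse_city_country(json.load(resp))


# Hand-written, trimmed example of the response shape (not real API output).
sample = {
    "status": "OK",
    "results": [
        {
            "address_components": [
                {"long_name": "Barcelona", "short_name": "Barcelona",
                 "types": ["locality", "political"]},
                {"long_name": "Spain", "short_name": "ES",
                 "types": ["country", "political"]},
            ]
        }
    ],
}
city, country = parse_city_country(sample)
print(city, country)
```

Keeping the parsing separate from the network call makes it possible to check the logic against a saved response without spending API quota. For a full data set, it would also be sensible to pause between requests and to handle non-`OK` statuses, which this sketch omits.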