Exporting Python Data to GeoJSON

I like to do my data wrangling and analysis work in Python, using the pandas library. I also use Python for much of my data visualization and simple mapping. But for interactive web maps, I usually use Leaflet. There isn’t dead-simple way to dump a pandas DataFrame with geographic data to something you can load with Leaflet. You could use GeoPandas to convert your DataFrame then dump it to GeoJSON, but that isn’t a very lightweight solution.

So, I wrote a simple reusable function to export any pandas DataFrame to GeoJSON:

Continue reading Exporting Python Data to GeoJSON

Animated 3-D Plots in Python

Download/cite the paper here!

In a previous post, I discussed chaos, fractals, and strange attractors. I also showed how to visualize them with static 3-D plots. Here, I’ll demonstrate how to create these animated visualizations using Python and matplotlib. All of my source code is available in this GitHub repo. By the end, we’ll produce animated data visualizations like this, in pure Python:

Animated 3-D Poincare plot of the logistic map's chaotic regime - this is time series data embedded in three dimensional state space Continue reading Animated 3-D Plots in Python

Visualizing Chaos and Randomness

3-D Poincare plot of the logistic map's chaotic regime - this is time series data embedded in three dimensional state space

Download/cite the paper here!

In a previous post, I discussed chaos theory, fractals, and strange attractors – and their implications for knowledge and prediction of systems. I also briefly touched on how phase diagrams (or Poincaré plots) can help us visualize system attractors and differentiate chaotic behavior from true randomness.

In this post (adapted from this paper), I provide more detail on constructing and interpreting phase diagrams. These methods are particularly useful for discovering deterministic chaos in otherwise random-appearing time series data, as they visualize strange attractors. I’m using Python for all of these visualizations and the source code is available in this GitHub repo.

Continue reading Visualizing Chaos and Randomness

Chaos Theory and the Logistic Map

Logistic map bifurcation diagram showing the period-doubling path to chaosUsing Python to visualize chaos, fractals, and self-similarity to better understand the limits of knowledge and prediction. Download/cite the article here and try pynamical yourself.

Chaos theory is a branch of mathematics that deals with nonlinear dynamical systems. A system is just a set of interacting components that form a larger whole. Nonlinear means that due to feedback or multiplicative effects between the components, the whole becomes something greater than just adding up the individual parts. Lastly, dynamical means the system changes over time based on its current state. In the following piece (adapted from this article), I break down some of this jargon, visualize interesting characteristics of chaos, and discuss its implications for knowledge and prediction.

Chaotic systems are a simple sub-type of nonlinear dynamical systems. They may contain very few interacting parts and these may follow very simple rules, but these systems all have a very sensitive dependence on their initial conditions. Despite their deterministic simplicity, over time these systems can produce totally unpredictable and wildly divergent (aka, chaotic) behavior. Edward Lorenz, the father of chaos theory, described chaos as “when the present determines the future, but the approximate present does not approximately determine the future.”

Continue reading Chaos Theory and the Logistic Map

Visualizing Summer Travels Part 6: Projecting Spatial Data with Python

This post is part of a series on visualizing data from my summer travels.

I’ve previously discussed visualizing the GPS location data from my summer travels with CartoDB, Leaflet, and Mapbox + Tilemill. I also visualized different aspects of this data set in Python, using the matplotlib plotting library. However, these spatial scatter plots used unprojected lat-long data which looked pretty distorted at European latitudes.

Today I will show how to convert this data into a projected coordinate reference system and plot it again using matplotlib. These projected maps will provide a much more accurate spatial representation of my spatial data and the geographic region. All of my code is available in this GitHub repo, particularly this notebook.

Continue reading Visualizing Summer Travels Part 6: Projecting Spatial Data with Python

Using geopandas on Windows

projected-shapefile-gps-coordinatesThis guide was updated in June 2016 to reflect changes to the dependencies and the ability to install with Python wheels.

I recently went through the exercise of installing geopandas on Windows and getting it to run. Having learned several valuable lessons, I thought I’d share them with the world in case anyone else is trying to get this toolkit working in a Windows environment (also see this GitHub gist I put together).

It seems that pip installing geopandas works fine on Linux and Mac. However, several of its dependencies have C extensions that cause compilation failures with pip on Windows. This guide gets around that issue. For preliminaries, I have this working on Windows 7, 8, and 10. My Python environments are Anaconda, 64-bit, with both Python 2.7 and 3.5. I’m running geopandas version 0.2 with GDAL 2.0.2, Fiona 1.7.0, pyproj 1.9.5.1, and shapely 1.5.16.

Continue reading Using geopandas on Windows

Visualizing Summer Travels

projected-shapefile-gps-coordinatesThis is a series of posts about visualizing spatial data. I spent a couple of months traveling in Europe this summer and collected GPS location data throughout the trip with the OpenPaths app. I explored different web mapping technologies such as CartoDB, Leaflet, Mapbox, and Tilemill to plot my travels. I also used Python and matplotlib to run some descriptive statistics and visualize other aspects of my trip.

Here is the series of posts:

My Python code is available in this GitHub repo. I also did some more involved work under the hood to prep the data and support these visualizations. For example, in the following posts I reverse-geocoded the spatial data set and reduced its size with clustering algorithms and the Douglas-Peucker algorithm:

Continue reading Visualizing Summer Travels

Visualizing Summer Travels Part 5: Python + Matplotlib

This post is part of a series on visualizing data from my summer travels.

I’ve previously discussed visualizing the GPS location data from my summer travels with CartoDB, Leaflet, and Mapbox + Tilemill. Today I will explore visualizing this data set in Python, using the matplotlib plotting library. All of my code is available in this GitHub repo, particularly this notebook.

Continue reading Visualizing Summer Travels Part 5: Python + Matplotlib

Reducing Spatial Data Set Size with Douglas-Peucker

In a previous post I discussed how to reduce the size of a spatial data set by clustering. Too many data points in a visualization can overwhelm the user and bog down on-the-fly client-side map rendering (for example, with a javascript tool like Leaflet). So, I used the DBSCAN clustering algorithm to reduce my data set from 1,759 rows to 158 spatially-representative points. This series of posts discusses this data set in depth.

Continue reading Reducing Spatial Data Set Size with Douglas-Peucker

Clustering to Reduce Spatial Data Set Size

In this tutorial, I demonstrate how to reduce the size of a spatial data set of GPS latitude-longitude coordinates using Python and its scikit-learn implementation of the DBSCAN clustering algorithm. All my code is in this IPython notebook in this GitHub repo, where you can also find the data.

Traditionally it’s been a problem that researchers did not have enough spatial data to answer useful questions or build compelling visualizations. Today, however, the problem is often that we have too much data. Too many scattered points on a map can overwhelm a viewer looking for a simple narrative. Furthermore, rendering a JavaScript web map (like Leaflet) with millions of data points on a mobile device can swamp the processor and be unresponsive.

Continue reading Clustering to Reduce Spatial Data Set Size