Categories
Tech

Scientific Python for Raspberry Pi

A guide to setting up the Python scientific stack, well-suited for geospatial analysis, on a Raspberry Pi 3. The whole process takes just a few minutes.

The Raspberry Pi 3 was announced two weeks ago and represents a substantial step up in computational power over its predecessors. It can serve as a functional, if underpowered, Wi-Fi connected Linux desktop computer. However, it’s perfectly capable of running the Python scientific computing stack including Jupyter, pandas, matplotlib, scipy, scikit-learn, and OSMnx.

Despite (or because of?) its low power, it’s ideal for low-overhead and repetitive tasks that researchers and engineers often face, including geocoding, web scraping, scheduled API calls, or recurring statistical or spatial analyses (with small-ish data sets). It’s also a great way to set up a simple server or experiment with Linux. This guide is aimed at newcomers to the world of Raspberry Pi and Linux who have an interest in setting up a Python environment on these $35 credit-card-sized computers. We’ll run through everything you need to do to get started (if your Pi is already up and running, skip steps 1 and 2).

Categories
Data

World Population Projections

The U.N. world population prospects data set depicts the U.N.’s projections for every country’s population, decade by decade through 2100. The 2015 revision was recently released, and I analyzed, visualized, and mapped the data (methodology and code described below).

The world population is expected to grow from about 7.3 billion people today to 11.2 billion in 2100. While the populations of Eastern Europe, Taiwan, and Japan are projected to decline significantly over the 21st century, the U.N. projects Africa’s population to grow by an incredible 3.2 billion people. This map depicts each country’s projected percentage change in population from 2015 to 2100:

UN world population projections data map: Africa, Asia, Australia, Europe, North America, South America

Categories
Academia

Urban Informatics and Visualization at UC Berkeley

The fall semester begins next week at UC Berkeley. For the third year in a row, Paul Waddell and I will be teaching CP255: Urban Informatics and Visualization, and this is my first year as co-lead instructor.

This master’s-level course trains students to analyze urban data, develop indicators, conduct spatial analyses, create data visualizations, and build interactive web maps. To do this, we use the Python programming language, open source analysis and visualization tools, and public data.

This course is designed to provide future city planners with a toolkit of technical skills for quantitative problem solving. We don’t require any prior programming experience – we teach this from the ground up – but we do expect prior knowledge of basic statistics and GIS.

Update, September 2017: I am no longer a Berkeley GSI, but Paul’s class is ongoing. Check out his fantastic teaching materials in his GitHub repo. From my experiences here, I have developed a course series on urban data science with Python and Jupyter, available in this GitHub repo.

Categories
Planning

Visualizing Craigslist Rental Listings

Our paper on collecting and analyzing U.S. housing rental markets through Craigslist rental listings has been accepted for publication by the Journal of Planning Education and Research. Check out the article here. This map of rental listings in the contiguous U.S. is divided into quintiles by rent per square foot:

Map of 1.5 million Craigslist rental listings in the contiguous US, summer 2014

Categories
Data

Visualizing Summer Travels Part 6: Projecting Spatial Data with Python

This post is part of a series on visualizing data from my summer travels.

I’ve previously discussed visualizing the GPS location data from my summer travels with CartoDB, Leaflet, and Mapbox + Tilemill. I also visualized different aspects of this data set in Python, using the matplotlib plotting library. However, these spatial scatter plots used unprojected lat-long data, which looked pretty distorted at European latitudes.

Today I will show how to convert this data into a projected coordinate reference system and plot it again using matplotlib. These projected maps will provide a much more accurate representation of my spatial data and the geographic region. All of my code is available in this GitHub repo, particularly this notebook.
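To give a rough sense of why unprojected lat-long plots look stretched at European latitudes, here is a minimal sketch using only the standard library. It is not the method from the post: it uses a simple equirectangular approximation on a spherical Earth, whereas a proper projected coordinate reference system would normally be handled by a dedicated projection library. The function name, the Earth radius constant, and the sample points are all assumptions made for illustration.

```python
import math

# Assumption: spherical Earth with the mean radius, good enough for a sketch.
EARTH_RADIUS_M = 6371000.0


def equirectangular_project(lat, lon, ref_lat, ref_lon=0.0):
    """Convert lat/lon in degrees to approximate x/y in meters.

    Longitude is scaled by cos(ref_lat) because meridians converge toward
    the poles: a degree of longitude covers less ground at high latitudes.
    """
    x = EARTH_RADIUS_M * math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat))
    y = EARTH_RADIUS_M * math.radians(lat)
    return x, y


# Hypothetical sample points (lat, lon), roughly Berlin, Paris, Barcelona.
points = [(52.52, 13.40), (48.86, 2.35), (41.39, 2.17)]

# Use the mean latitude of the data as the reference latitude.
ref_lat = sum(lat for lat, _ in points) / len(points)
projected = [equirectangular_project(lat, lon, ref_lat) for lat, lon in points]

for (lat, lon), (x, y) in zip(points, projected):
    print(f"({lat}, {lon}) -> x={x:.0f} m, y={y:.0f} m")
```

Plotting raw longitude against raw latitude treats a degree of each as the same length, which is only true at the equator. At 60 degrees north a degree of longitude spans about half the distance of a degree of latitude, so an unprojected scatter plot stretches the map east to west.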

Categories
Data

Using geopandas on Windows

This guide was written in 2014 and updated slightly in November 2020.

I recently went through the exercise of installing geopandas on Windows. Having learned several valuable lessons, I thought I’d share them with the world in case anyone else is trying to get this toolkit working in a Windows environment. It seems that pip installing geopandas usually works fine on Linux and Mac. However, several of its dependencies have C extensions that can cause compilation failures with pip on Windows. This guide gets around that issue.

Categories
Data

Visualizing Summer Travels

This is a series of posts about visualizing spatial data. I spent a couple of months traveling in Europe this summer and collected GPS location data throughout the trip with the OpenPaths app. I explored different web mapping technologies such as CartoDB, Leaflet, Mapbox, and Tilemill to plot my travels. I also used Python and matplotlib to run some descriptive statistics and visualize other aspects of my trip.

Here is the series of posts:

My Python code is available in this GitHub repo. I also did some more involved work under the hood to prep the data and support these visualizations. For example, in the following posts I reverse-geocoded the spatial data set and reduced its size with clustering algorithms and the Douglas-Peucker algorithm:

Categories
Data

Visualizing Summer Travels Part 5: Python + Matplotlib

This post is part of a series on visualizing data from my summer travels.

I’ve previously discussed visualizing the GPS location data from my summer travels with CartoDB, Leaflet, and Mapbox + Tilemill. Today I will explore visualizing this data set in Python, using the matplotlib plotting library. All of my code is available in this GitHub repo, particularly this notebook.

Categories
Data

Visualizing Summer Travels Part 4: Mapbox + Tilemill

This post is part of a series on visualizing data from my summer travels.

I’ve previously discussed my goals in visualizing GPS data from my summer travels and explored visualizing the data set with CartoDB and with Leaflet. The full OpenPaths location data from my summer travels is available here and I discussed how I reverse-geocoded it here.

Mapbox is a major provider of web mapping services such as tiled web maps, the Tilemill cartography IDE, and the mapbox.js JavaScript library. Today I’ll run through how to create an interactive data map in Tilemill’s design studio, export the map as a set of tiles, upload the tileset to Mapbox, and then use a JavaScript client to display the map on a web page. Our final result will look something like this:

Categories
Data

Visualizing Summer Travels Part 3: Leaflet

This post is part of a series on visualizing data from my summer travels.

I’ve previously discussed my goals in visualizing GPS data from my summer travels and explored visualizing the data set with CartoDB. The full OpenPaths location data from my summer travels is available here and I discussed how I reverse-geocoded it here.

Lastly, I reduced the size of this spatial data set so Leaflet can render it more quickly on low-power mobile devices. I discussed why this is important and how to do it with the DBSCAN clustering algorithm and also with the Douglas-Peucker algorithm. The final data set I’ll be working with is available here.
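For readers unfamiliar with the Douglas-Peucker algorithm mentioned above, here is a minimal pure-Python sketch of the general technique. It is not the code from the linked posts: the function names and the sample track are hypothetical, and it works on planar x/y coordinates, so lat-long data would need to be projected first for the tolerance to mean a real-world distance.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the infinite line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, tolerance):
    """Simplify a polyline, keeping points farther than tolerance from the
    line joining the endpoints of their segment."""
    if len(points) < 3:
        return list(points)

    # Find the interior point farthest from the endpoint-to-endpoint line.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i

    # Everything is close enough to the line: keep only the endpoints.
    if max_dist <= tolerance:
        return [points[0], points[-1]]

    # Otherwise split at the farthest point and simplify each half.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


# Hypothetical track in planar units.
track = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(track, tolerance=0.1)
print(simplified)
```

The tolerance controls the trade-off: a larger value drops more points and renders faster, at the cost of a coarser line.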