Categories
Data

Street Network Analysis in a Docker Container

Containerization is the way of the future present. I’ve heard feedback from some folks over the past few months who would like to play around with OSMnx for street network analysis, transport modeling, and urban design—but can’t because they can’t install Python and its data science stack on their computers. Furthermore, it would be nice to have a consistent reference environment to deploy on AWS or elsewhere in the cloud.

So, I’ve created a docker image containing OSMnx, Jupyter, and the rest of the Python geospatial data science stack, available on docker hub alongside additional usage instructions. If you’re starting from scratch, you can get started in four simple steps:

Categories
Data

Network-Based Spatial Clustering

Jobs, establishments, and other amenities tend to agglomerate and cluster in cities. To identify these agglomerations and explore their causes and effects, we often use spatial clustering algorithms. However, urban space cannot simply be traversed as-the-crow-flies: human mobility is network-constrained. To properly model agglomeration along a city’s street network, we must use network-based spatial clustering.

The code for this example can be found in this GitHub repo. We use OSMnx to download and assemble the street network for a small city. We also have a dataframe of points representing the locations of (fake) restaurants in this city. Our restaurants cluster into distinct districts, as many establishments and industries tend to do:

firm locations on the street network to be clustered: python, osmnx, matplotlib, scipy, scikit-learn, geopandas

Categories
Data

OSMnx Features Round-Up

OSMnx is a Python package for quickly and easily downloading, modeling, analyzing, and visualizing street networks and other spatial data from OpenStreetMap. Here’s a quick round-up of recent updates to OSMnx. I’ll try to keep this up to date as a single reference source. A lot of new features have appeared in the past few months, and people have been asking about what’s new and what’s possible. You can:

  • Download and model street networks or other networked infrastructure anywhere in the world with a single line of code
  • Download any other spatial geometries, place boundaries, building footprints, or points of interest as a GeoDataFrame
  • Download by city name, polygon, bounding box, or point/address + network distance
  • Download drivable, walkable, bikeable, or all street networks
  • Download node elevations and calculate edge grades (inclines)
  • Impute missing speeds and calculate graph edge travel times
  • Simplify and correct the network’s topology to clean-up nodes and consolidate intersections
  • Fast map-matching of points, routes, or trajectories to nearest graph edges or nodes
  • Save networks to disk as shapefiles, geopackages, and GraphML
  • Save/load street network to/from a local .osm xml file
  • Conduct topological and spatial analyses to automatically calculate dozens of indicators
  • Calculate and visualize street bearings and orientations
  • Calculate and visualize shortest-path routes that minimize distance, travel time, elevation, etc
  • Visualize street networks as a static map or interactive leaflet web map
  • Visualize travel distance and travel time with isoline and isochrone maps
  • Plot figure-ground diagrams of street networks and building footprints
Categories
Data

Animating the Lorenz Attractor with Python

Edward Lorenz, the father of chaos theory, once described chaos as “when the present determines the future, but the approximate present does not approximately determine the future.”

Lorenz first discovered chaos by accident while developing a simple mathematical model of atmospheric convection, using three ordinary differential equations. He found that nearly indistinguishable initial conditions could produce completely divergent outcomes, rendering weather prediction impossible beyond a time horizon of about a fortnight.

Lorenz system attractor animated GIF created with Python matplotlib scipy numpy PIL

Categories
Tech

Scientific Python for Raspberry Pi

Raspberry Pi 3 Model BA guide to setting up the Python scientific stack, well-suited for geospatial analysis, on a Raspberry Pi 3. The whole process takes just a few minutes.

The Raspberry Pi 3 was announced two weeks ago and presents a substantial step up in computational power over its predecessors. It can serve as a functional Wi-Fi connected Linux desktop computer, albeit underpowered. However it’s perfectly capable of running the Python scientific computing stack including Jupyter, pandas, matplotlib, scipy, scikit-learn, and OSMnx.

Despite (or because of?) its low power, it’s ideal for low-overhead and repetitive tasks that researchers and engineers often face, including geocoding, web scraping, scheduled API calls, or recurring statistical or spatial analyses (with small-ish data sets). It’s also a great way to set up a simple server or experiment with Linux. This guide is aimed at newcomers to the world of Raspberry Pi and Linux, but who have an interest in setting up a Python environment on these $35 credit card sized computers. We’ll run through everything you need to do to get started (if your Pi is already up and running, skip steps 1 and 2).

Categories
Academia

Urban Informatics and Visualization at UC Berkeley

The fall semester begins next week at UC Berkeley. For the third year in a row, Paul Waddell and I will be teaching CP255: Urban Informatics and Visualization, and this is my first year as co-lead instructor.

This masters-level course trains students to analyze urban data, develop indicators, conduct spatial analyses, create data visualizations, and build Paris open datainteractive web maps. To do this, we use the Python programming language, open source analysis and visualization tools, and public data.

This course is designed to provide future city planners with a toolkit of technical skills for quantitative problem solving. We don’t require any prior programming experience – we teach this from the ground up – but we do expect prior knowledge of basic statistics and GIS.

Update, September 2017: I am no longer a Berkeley GSI, but Paul’s class is ongoing. Check out his fantastic teaching materials in his GitHub repo. From my experiences here, I have developed a course series on urban data science with Python and Jupyter, available in this GitHub repo.

Categories
Data

Clustering to Reduce Spatial Data Set Size

Read/cite the paper here.

In this tutorial, I demonstrate how to reduce the size of a spatial data set of GPS latitude-longitude coordinates using Python and its scikit-learn implementation of the DBSCAN clustering algorithm. All my code is in this IPython notebook in this GitHub repo, where you can also find the data.

Traditionally it’s been a problem that researchers did not have enough spatial data to answer useful questions or build compelling visualizations. Today, however, the problem is often that we have too much data. Too many scattered points on a map can overwhelm a viewer looking for a simple narrative. Furthermore, rendering a JavaScript web map (like Leaflet) with millions of data points on a mobile device can swamp the processor and be unresponsive.