Tag: numpy

Tech

Scientific Python for Raspberry Pi

Post author By gboeing
Post date 2016-03-14

A guide to setting up the Python scientific stack, well-suited for geospatial analysis, on a Raspberry Pi 3. The whole process takes just a few minutes.

The Raspberry Pi 3 was announced two weeks ago and presents a substantial step up in computational power over its predecessors. It can serve as a functional, if underpowered, Wi-Fi connected Linux desktop computer. However, it’s perfectly capable of running the Python scientific computing stack including Jupyter, pandas, matplotlib, scipy, scikit-learn, and OSMnx.

Despite (or because of?) its low power, it’s ideal for the low-overhead, repetitive tasks that researchers and engineers often face, including geocoding, web scraping, scheduled API calls, and recurring statistical or spatial analyses (with small-ish data sets). It’s also a great way to set up a simple server or experiment with Linux. This guide is aimed at newcomers to Raspberry Pi and Linux who want to set up a Python environment on these $35 credit-card-sized computers. We’ll run through everything you need to do to get started (if your Pi is already up and running, skip steps 1 and 2).
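Once a stack like this is installed, a quick sanity check can confirm which pieces are importable. The following is only a minimal sketch: the `STACK` list and the `missing_packages` helper are illustrative names, not part of the guide, and the import names (for example `sklearn` for scikit-learn) are assumptions about how each package is exposed on your system.

```python
from importlib.util import find_spec

# Assumed top-level import names for the packages mentioned above.
STACK = ["numpy", "scipy", "pandas", "matplotlib", "sklearn", "jupyter", "osmnx"]


def missing_packages(names):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(STACK)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All packages found.")
```

Because `find_spec` only locates a module without importing it, the check stays fast even on a low-powered board.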

Tags api, basemap, data, data science, geocoding, geopandas, geopy, geospatial, iot, ipython, jupyter, linux, matplotlib, numpy, pandas, pyproj, raspberry pi, raspbian, science, scikit-learn, scipy, scrapy, shapely, statistics, statsmodels, web scraping

Data

The Landscape of U.S. Rents

Post author By gboeing
Post date 2015-11-19

Which U.S. cities are the most expensive for rental housing? Where are rents rising the fastest? The American Community Survey (ACS) recently released its latest batch of 1-year data and I analyzed, mapped, and visualized it. My methodology is below, and my code and data are in this GitHub repo.

This interactive map shows median rents across the U.S. for every metropolitan and micropolitan area. Click any area for details on population, rent, and change over time. Click “switch” to re-draw the map and visualize how median rents have risen since 2010.

Tags basemap, census, cities, data, gis, housing, javascript, leaflet, maps, matplotlib, numpy, pandas, population, python, rents, statsmodels, united states

Academia

Urban Informatics and Visualization at UC Berkeley

Post author By gboeing
Post date 2015-08-20

The fall semester begins next week at UC Berkeley. For the third year in a row, Paul Waddell and I will be teaching CP255: Urban Informatics and Visualization, and this is my first year as co-lead instructor.

This master’s-level course trains students to analyze urban data, develop indicators, conduct spatial analyses, create data visualizations, and build interactive web maps. To do this, we use the Python programming language, open source analysis and visualization tools, and public data.

This course is designed to provide future city planners with a toolkit of technical skills for quantitative problem solving. We don’t require any prior programming experience – we teach this from the ground up – but we do expect prior knowledge of basic statistics and GIS.

Update, September 2017: I am no longer a Berkeley GSI, but Paul’s class is ongoing. Check out his fantastic teaching materials in his GitHub repo. From my experiences here, I have developed a course series on urban data science with Python and Jupyter, available in this GitHub repo.

Tags academia, anaconda, arcgis, berkeley, cartodb, city, code for america, data, data science, geocoding, geopandas, geopy, geospatial, gis, github, javascript, land use, leaflet, localdata, mapbox, maps, matplotlib, modeling, numpy, pandas, planning, projection, qgis, science, scikit-learn, scipy, scrapy, shapely, smart cities, socrata, statistics, tilemill, tutorial, urban, urban design, urban planning, visualization, wordpress

Data

Visualizing Summer Travels Part 5: Python + Matplotlib

Post author By gboeing
Post date 2014-08-29

This post is part of a series on visualizing data from my summer travels.

I’ve previously discussed visualizing the GPS location data from my summer travels with CartoDB, Leaflet, and Mapbox + TileMill. Today I will explore visualizing this data set in Python, using the matplotlib plotting library. All of my code is available in this GitHub repo, particularly this notebook.

Tags clustering, data, dbscan, geopandas, geopy, geospatial, gis, maps, matplotlib, numpy, pandas, python, shapefiles, shapely, travel, tutorial, visualization

Data

Clustering to Reduce Spatial Data Set Size

Post author By gboeing
Post date 2014-08-20

Read/cite the paper here.

In this tutorial, I demonstrate how to reduce the size of a spatial data set of GPS latitude-longitude coordinates using Python and its scikit-learn implementation of the DBSCAN clustering algorithm. All my code is in this IPython notebook in this GitHub repo, where you can also find the data.

Traditionally, the problem was that researchers did not have enough spatial data to answer useful questions or build compelling visualizations. Today, however, the problem is often that we have too much data. Too many scattered points on a map can overwhelm a viewer looking for a simple narrative. Furthermore, rendering a JavaScript web map (like Leaflet) with millions of data points on a mobile device can swamp the processor and leave the map unresponsive.
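The DBSCAN-based reduction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the notebook: the sample coordinates, the 1.5 km neighborhood, `min_samples=2`, and the `reduce_points` helper are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical sample data: two tight groups of GPS fixes and one isolated point,
# each row being [latitude, longitude] in decimal degrees.
coords = np.array([
    [37.8716, -122.2727],
    [37.8718, -122.2725],
    [37.8715, -122.2729],
    [48.8566, 2.3522],
    [48.8567, 2.3521],
    [48.8565, 2.3524],
    [51.5074, -0.1278],
])

KMS_PER_RADIAN = 6371.0088       # mean Earth radius in km
epsilon = 1.5 / KMS_PER_RADIAN   # assumed 1.5 km neighborhood, expressed in radians

# The haversine metric expects [lat, lon] pairs in radians.
db = DBSCAN(eps=epsilon, min_samples=2, algorithm="ball_tree", metric="haversine")
labels = db.fit_predict(np.radians(coords))


def reduce_points(coords, labels):
    """Keep one representative point per cluster, plus any unclustered points."""
    kept = []
    for label in sorted(set(labels) - {-1}):
        members = coords[labels == label]
        centroid = members.mean(axis=0)
        # Simplification: pick the member nearest the centroid in degree space.
        nearest = members[np.argmin(((members - centroid) ** 2).sum(axis=1))]
        kept.append(nearest)
    kept.extend(coords[labels == -1])  # DBSCAN marks noise points with label -1
    return np.array(kept)


reduced = reduce_points(coords, labels)
print(f"{len(coords)} points reduced to {len(reduced)}")
```

Keeping an actual member of each cluster, rather than the centroid itself, means every retained point is a location that was really recorded.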

Tags clustering, data, dbscan, geopy, geospatial, gis, k-means, maps, numpy, pandas, python, scikit-learn, scipy, tutorial, visualization