College Football Stadium Attendance

A few months ago, I wrote about the large investments that U.S. universities are making in their football stadiums. This also included a visual analysis of stadium capacity around the country. Outside of North Korea, the 8 largest stadiums in the world are college football stadiums, and the 15 largest college football stadiums are larger than any NFL stadium.

I received a few comments interested in further analysis of the actual attendance of games held in these stadiums. While capacity is interesting because it represents an expectation and sustained investment by the school, attendance represents the utilization of that investment. My stadium capacity data covered every NCAA division I football stadium in the U.S. as of the 2015 college football season. So, I downloaded the NCAA’s 2015 home game attendance data to compare. My data, code, and analysis are in this GitHub repo. First, I visualized the FBS attendance figures themselves:

NCAA college football teams' stadiums' 2015 average attendance per game Continue reading College Football Stadium Attendance

How to Visualize Urban Accessibility and Walkability

Tools like WalkScore visualize how “walkable” a neighborhood is in terms of access to different amenities like parks, schools, or restaurants. It’s easy to create accessibility visualizations like these ad hoc with Python and its pandana library. Pandana (pandas for network analysis – developed by Fletcher Foti during his dissertation research here at UC Berkeley) performs fast accessibility queries over a network. I’ll demonstrate how to use it to visualize urban walkability. My code is in these IPython notebooks in this urban data science course GitHub repo.

First I give pandana a bounding box around Berkeley/Oakland in the East Bay of the San Francisco Bay Area. Then I load the street network and amenities from OpenStreetMap. In this example I’ll look at accessibility to restaurants, bars, and schools. But, you can create any basket of amenities that you are interested in – basically visualizing a personalized “AnythingScore” instead of a generic WalkScore for everyone. Finally I calculate and plot the distance from each node in the network to the nearest amenity:

Berkeley Oakland California street network walking accessibility and walkability Continue reading How to Visualize Urban Accessibility and Walkability

Mapping Everywhere I’ve Ever Been in My Life

I recently wrote about visualizing my Foursquare check-in history and mapping my Google location history, and it inspired me to mount a more substantial project: mapping everywhere I’ve ever been in my life (!!). I’ve got 4 years of Foursquare check-ins and Google location history data. For everything pre-smart phone, I typed up a simple spreadsheet of places I’d visited in the past and then geocoded it with the Google Maps API. All my Python and Leaflet code is available in this GitHub repo and is easy to re-purpose to visualize your own location history.

I’ll show the maps first, then run through the process I followed, below. First off, I used Python and matplotlib basemap to create this map of everywhere I’ve ever been:

Location History World Map, data from Foursquare and Google, made with Python matplotlib basemap

Continue reading Mapping Everywhere I’ve Ever Been in My Life

Scientific Python for Raspberry Pi

Raspberry Pi 3 Model BA guide to setting up the Python scientific stack, well-suited for geospatial analysis, on a Raspberry Pi 3. The whole process takes just a few minutes.

The Raspberry Pi 3 was announced two weeks ago and presents a substantial step up in computational power over its predecessors. It can serve as a functional Wi-Fi connected Linux desktop computer, albeit underpowered. However it’s perfectly capable of running the Python scientific computing stack including Jupyter, pandas, matplotlib, scipy, scikit-learn, and OSMnx.

Despite (or because of?) its low power, it’s ideal for low-overhead and repetitive tasks that researchers and engineers often face, including geocoding, web scraping, scheduled API calls, or recurring statistical or spatial analyses (with small-ish data sets). It’s also a great way to set up a simple server or experiment with Linux. This guide is aimed at newcomers to the world of Raspberry Pi and Linux, but who have an interest in setting up a Python environment on these $35 credit card sized computers. We’ll run through everything you need to do to get started (if your Pi is already up and running, skip steps 1 and 2). Continue reading Scientific Python for Raspberry Pi

World Population Projections

Batman and Robin: By 2050, 70% of the world's population...The U.N. world population prospects data set depicts the U.N.’s projections for every country’s population, decade by decade through 2100. The 2015 revision was recently released, and I analyzed, visualized, and mapped the data (methodology and code described below).

The world population is expected to grow from about 7.3 billion people today to 11.2 billion in 2100. While the populations of Eastern Europe, Taiwan, and Japan are projected to decline significantly over the 21st century, the U.N. projects Africa’s population to grow by an incredible 3.2 billion people. This map depicts each country’s projected percentage change in population from 2015 to 2100:

UN world population projections data map: Africa, Asia, Australia, Europe, North America, South America

Continue reading World Population Projections

Urban Informatics and Visualization at UC Berkeley

The fall semester begins next week at UC Berkeley. For the third year in a row, Paul Waddell and I will be teaching CP255: Urban Informatics and Visualization, and this is my first year as co-lead instructor.

This masters-level course trains students to analyze urban data, develop indicators, conduct spatial analyses, create data visualizations, and build Paris open datainteractive web maps. To do this, we use the Python programming language, open source analysis and visualization tools, and public data.

This course is designed to provide future city planners with a toolkit of technical skills for quantitative problem solving. We don’t require any prior programming experience – we teach this from the ground up – but we do expect prior knowledge of basic statistics and GIS.

Update, September 2017: I am no longer a Berkeley GSI, but Paul’s class is ongoing. Check out his fantastic teaching materials in his GitHub repo. From my experiences here, I have developed a cycle of course materials, IPython notebooks, and tutorials towards an urban data science course based on Python, available in this GitHub repo.

Continue reading Urban Informatics and Visualization at UC Berkeley