My article “Housing Search in the Age of Big Data: Smarter Cities or the Same Old Blind Spots?” with Max Besbris, Ariela Schachter, and John Kuk is now published in Housing Policy Debate. We look at the quantity and quality of information in online housing listings and find that they are much higher in White and non-poor neighborhoods than they are in poor, Black, or Latino neighborhoods. Listings in White neighborhoods include more descriptive text and focus on unit and neighborhood amenities, while listings in Black neighborhoods focus more on applicant (dis)qualifications. We discuss what this means for housing markets, filter bubbles, residential sorting and segregation, and housing policy. You can download a free PDF.
Housing search technologies are changing and, as a result, so are housing search behaviors. The most recent American Housing Survey revealed that, for the first time, more urban renters found their current homes through online technology platforms than any other information channel. These technology platforms collect and disseminate user-generated content and construct a virtual agora for users to share information with one another. Because they can provide real-time data about various urban phenomena, housing technology platforms are a key component of the smart cities paradigm.
This paradigm promotes information technology as both a technocratic mode of monitoring cities and a utopian mode of improving urban life through big data. In this context, “big data” typically refers to massive streams of user-generated content resulting from millions or billions of decentralized human actions. Data exhaust from Craigslist and other housing technology platforms offers a good example: optimistically, large corpora of rental listings could provide housing researchers and practitioners with actionable insights for policymaking while also equalizing access to information for otherwise disadvantaged homeseekers. But how good are these platforms at resolving the types of problems that already plague old-fashioned, non-big data? Does this broadcasting of information reduce longstanding geographic and demographic inequalities or do established patterns of segmentation and sorting remain?
As a publicly available and free technology platform, Craigslist offers exceptionally low barriers to entry. Unlike newspaper listings or brokers, it requires no payment from landlords to list their own units. With such low barriers to entry, Craigslist could possibly be the most representative exchange of rental information, although use is contingent on access and ability to navigate the internet. While alternative information channels exist for luxury rental listings, low-income listings, and non-English listings, Craigslist is by far the most trafficked and largest single source of rental information. In this paper we review, empirically extend, and retheorize ongoing research projects that collected Craigslist rental listings via web scraping.
We find significant sociospatial differences in both the volume and type of information provided to prospective tenants on Craigslist. Recent work has speculated about the potential of technology platforms to democratize information and broaden homeseeker choice sets, but our findings question these platforms’ ability to make searches more equitable: online rental listings reproduce historical patterns of residential steering, sorting, and (information channel) segregation as a function of existing demographics and inequality. Given the segregated nature of the housing search process, our findings demonstrate that online housing listings are more likely to exacerbate than to ameliorate inequality.
Listings in poor White, Black, and Latino tracts, as well as non-poor Black and Latino tracts, contain fewer words on average than listings in non-poor White tracts. Listings in poor Black tracts contain fewer words on average than those in any other tract type, and even listings in non-poor Black tracts contain fewer than those in all other tract types except poor Black and poor Latino tracts. In contrast, listings in poor and non-poor Asian tracts contain more words on average than those posted in all other tracts, including non-poor White tracts. Per-listing information volume varies at the intersection of race and poverty, leaving poor Latino and, particularly, Black tracts relatively the most disadvantaged in terms of information content.
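To make the word-count comparison concrete, here is a minimal sketch of how one might average listing length by tract type. The tract labels, the toy listings, and the `mean_words_by_category` helper are all invented for illustration. They are not the paper's data or code, and splitting on whitespace is only one of several reasonable ways to count words.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy listings as (tract_category, listing_text) pairs.
# Invented for illustration only; this is not data from the paper.
listings = [
    ("non-poor White", "Sunny two bedroom near park with hardwood floors and new kitchen"),
    ("non-poor White", "Spacious loft close to transit restaurants and shops"),
    ("poor Black", "One bedroom. Proof of income required."),
    ("poor Black", "No evictions. Call today."),
]


def mean_words_by_category(rows):
    """Return {tract_category: mean whitespace-separated words per listing}."""
    counts = defaultdict(list)
    for category, text in rows:
        counts[category].append(len(text.split()))
    return {category: mean(n) for category, n in counts.items()}


averages = mean_words_by_category(listings)
for category, avg in averages.items():
    print(f"{category}: {avg} words per listing on average")
```

With real data, the same grouping would run over scraped listings joined to census tract characteristics, a step this sketch assumes has already been done.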
Further, the information in listings in tracts with more Black, Latino, or poorer residents disproportionately focuses on tenant (dis)qualifications (e.g., proof of income, eviction history, criminal history) rather than unit/amenity descriptions. In contrast, listings in whiter or lower-poverty areas contain more information and devote more text to describing units/amenities. A clear relationship exists between tract poverty and the amount and type of information provided, but there is also a racial hierarchy: listings in low-poverty Black or Latino communities contain less information and focus more strongly on tenant (dis)qualifications than listings in low-poverty White communities do. Listings in White tracts discuss neighborhood amenities (e.g., proximity to parks, restaurants, and public transportation) more extensively than listings in Black or Latino tracts.
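One simple way to picture the difference in what listings talk about is to measure the share of a listing's words that fall into each theme. The sketch below uses plain keyword matching as a stand-in. It is not the method used in the paper, and the keyword lists and the `term_shares` helper are hypothetical choices made only for this example.

```python
import re

# Hypothetical keyword lists, chosen for illustration only.
QUALIFICATION_TERMS = {"income", "eviction", "evictions", "criminal", "background", "credit"}
AMENITY_TERMS = {"park", "parks", "restaurants", "transit", "hardwood", "balcony", "dishwasher"}


def term_shares(text):
    """Return (qualification_share, amenity_share) as fractions of all word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0, 0.0
    qual = sum(token in QUALIFICATION_TERMS for token in tokens)
    amen = sum(token in AMENITY_TERMS for token in tokens)
    return qual / len(tokens), amen / len(tokens)


# Two invented listings: one focused on applicant screening, one on amenities.
screening_shares = term_shares("Proof of income required. No evictions.")
amenity_shares = term_shares("Hardwood floors near parks and restaurants")
print(screening_shares, amenity_shares)
```

Keyword lists like these are crude and miss context (for example, "no credit check"), so they are best treated as a quick first look rather than a measurement strategy.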
We focused on Craigslist here because, compared to other housing technology platforms, it is the largest and most democratic in that it has minimal barriers to entry and no listing costs. Yet even though it is the most accessible platform and has no complicated algorithms that target search results, we still find that its user-generated information reproduces traditional information segregation patterns. Other online housing platforms potentially distribute information in even more unequal ways, constructing filter bubbles in the residential search and sorting process. In fact, HUD recently sued Facebook for violating fair housing laws, claiming its platform limits who can see advertisements for housing based on their race/ethnicity, religion, and current location. As cities and citizens increasingly turn to technology platforms to mediate urban processes, more unanticipated consequences, such as these housing information filter bubbles, will likely appear.
New urban technology platforms and their data both reflect our world and shape it. Despite the optimistic rhetoric of smart cities advocates, techno-utopian solutions are rarely equalizing in practice or even in design. Cities increasingly rely on technology platforms as a mode of governance, administration, observation, and participation, but in many ways this reliance can reproduce or even exacerbate preexisting inequalities. While these platforms reflect millions of disaggregate transactions among rental market participants, they also reconstruct the market itself. Online listings can reduce housing search costs, expand search radii without requiring physical location visits, and broaden homeseeker choice sets. But housing information quantity and quality vary between neighborhoods in ways that correlate with sociodemographics.
In turn, the information-broadcasting benefits of these housing technology platforms are unevenly distributed among communities. On one hand, this unevenness concentrates the technology’s benefits in privileged communities. On the other hand, it could also open up such communities to information-deprived housing seekers by making whiter, wealthier, and better-educated communities more equally legible to everyone in the search process through a larger volume of rental listings and higher-quality unit information.
Overall, the differences documented here show that online platforms with user-generated content do not automatically smooth information exchange, reduce information asymmetries, or attenuate entrenched sociospatial inequalities. This is not simply a data problem. Although Craigslist’s biases may be unintended, they nevertheless point to the broader limits of housing technology platforms themselves, which rely on user self-selection and structural market forces to generate information.
For more, check out the paper itself. You might also be interested in my recent paper on rental listings and information inequality, my paper on rental housing spot markets and the dearth of up-to-date rent data, or my earlier paper on using Craigslist to understand rental markets.