Writing about visualization, demographics, dashboards, and spatial data science.

Interested in learning more? Hire me for a workshop or to consult on your next project. See the Services page for more details.

Every fall, I teach a course on exploratory data analysis and data visualization using Python via the Anaconda distribution and the Jupyter Notebook. This past semester, I ran the course using SageMathCloud, an online platform created by William Stein that delivers a cloud-based computational data analysis environment with access to Python, R, Julia, and several other languages. My experience with SageMathCloud was incredibly positive - I’ll go so far as to say that it is the best teaching tool I have ever used in my career.

Last week, I had the opportunity to lead a Geographic Information Systems workshop at the 2017 Society for Historical Archaeology Conference. During the day-long workshop, I introduced participants to a wide variety of key GIS concepts and taught them how to apply those concepts to a series of historical topics. We used ArcGIS (ArcMap and ArcScene) as well as CARTO. By the end of the day, participants had learned how to build an interactive CARTO map of Civil War battle sites from the National Park Service that can be filtered with a time slider widget.

In November, sf, the new simple features package for R, hit CRAN. The package is like rgdal, sp, and rgeos rolled into one, is much faster, and allows for data processing with dplyr verbs! Also, because sf objects are represented in a much simpler way than sp objects, the package allows for spatial analysis in R within magrittr pipelines. This post showcases some of this functionality in a simulated spatial analysis workflow, in which an analyst wants to determine whether customers have visited a point of interest (POI) based on GPS tracking data.

I strongly believe that interactive reports, presentations, and scholarly articles are going to become much more prominent in the years ahead. Whereas a PDF article or presentation can often only show a limited aspect of a research project, interactive documents can allow a reader or presenter to explore project content in a much broader sense. For dynamic research documents, an excellent option is to combine Shiny with R Markdown to generate a report that can execute R code from a Shiny server.

I noticed Ari Lamstein’s call for submissions to the R Shapefile Contest with interest. Commonly, we see spatial data in R used for visualization - e.g. choropleth maps. However, R has a massive ecosystem available to use spatial data in a wide variety of analyses that leverage its geographic properties. I commonly read posts about whether spatial data is “special” or not - we geographers tend to say yes (see here: https://www.

This week, Emily Badger and Darla Cameron at The Washington Post’s Wonkblog published an article (linked here) discussing data from the Federal Housing Finance Agency that suggest that the greatest increases in house prices in large metropolitan areas tend to be found in urban rather than suburban areas. Wonkblog published a series of maps illustrating this for Washington DC, Portland OR, Houston TX, Denver CO, and Minneapolis-St. Paul MN. I was particularly impressed by the visuals produced by the Wonkblog team, but wanted to see if the trend is replicated in my metropolitan area, Dallas-Fort Worth.

Exploring flows between origins and destinations visually is a common task, but it can be difficult to get right. In R, there are many tutorials on the web that show how to produce static flow maps (see here, here, here, and here, among others). Over the past couple of years, R developers have created an infrastructure to bridge R with JavaScript using the htmlwidgets package, allowing for the generation of interactive web visualizations straight from R.

Mapbox recently announced that map styles designed in the new Mapbox Studio are now available as basemaps on other platforms, such as Tableau, CartoDB, and ArcGIS Online: https://www.mapbox.com/blog/use-studio-styles-in-gis-tools/. Previously, this wasn’t possible because these tools were incompatible with the GL-based vector tiles produced by Studio. However, Mapbox’s Tiles API now translates GL vector tiles into tiles that are compatible with these products, as well as with Leaflet.js: https://www.mapbox.com/blog/mapbox-studio-tiles-static/. This means that beautiful maps designed in Studio are accessible to R users as well!

It’s been a while since I last posted here - but I’ve been working on a new R package that I’m quite excited about, and I thought this would be the right place to post. My new package, idbr, is an R interface to the United States Census Bureau’s International Data Base API. The IDB includes a host of international demographic indicators - including historical data and projections to 2050. I use IDB data all of the time for my teaching - and idbr makes the process of getting the data much easier!

I use data from the US Census Bureau’s American Community Survey all of the time. I also use R all of the time. Naturally, this means that I often use ACS data in R - which is pertinent given last week’s release of the new 2010-2014 ACS estimates. I wanted easy access to the data to facilitate my ongoing research on demographic trends in US metros and my work at the TCU Center for Urban Studies; as such, I wrote a small R package to provide quick access to the data, acs14lite (https://github.