Exploring the United Nations population projections with rCharts

by Kyle Walker

Please note: some NVD3 charts are performing very slowly in the latest version of Google Chrome at the moment; see this GitHub issue. As such, this post is best viewed in other browsers.

I recently came across this really interesting post from Ben Jones that explores the history and future of world population change with Tableau. I haven’t used Tableau much, but I was impressed with the different ways in which Ben used the software to visualize various aspects of global population change. More broadly, I can see multiple ways in which a visualization like this would be useful in teaching a course like World Regional Geography. An instructor could ask students to answer a series of questions regarding global population change, using the visualizations as their evidence. Students would then report back to the class about what they’ve learned from the visualizations, presenting different views to support their answers to the questions.

Reading this post motivated me to share a couple of more basic visualizations that I’ve used in my teaching to support discussions of global population change. These are some of the first charts I created with rCharts when I started working with the package late last year. However, I’ve updated the code below to use dplyr, Hadley Wickham’s new package, which makes data munging in R so much easier. Eventually, I would like to create a Shiny application similar to what Ben created with Tableau, but built with R and rCharts.

In the examples below, I use population projection data from the United Nations’ World Population Prospects, 2012 Revision. I create the visualizations with NVD3. The first example is a stacked area chart that shows the changing regional distribution of the world’s population between 1950 and 2100.

library(dplyr)
library(rCharts)
library(RColorBrewer)

dat <- read.csv("http://esa.un.org/wpp/ASCII-Data/ASCII_FILES/WPP2012_DB02_POPULATIONS_ANNUAL.csv")

## Alternatively, download the file from the above link and save it in your working directory

# library(data.table)
# dat <- fread("WPP2012_DB02_POPULATIONS_ANNUAL.csv")

regions <- c("Africa", "Latin America and the Caribbean", "Northern America", "Europe", "Oceania", "Asia")

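# Keep a single projection variant (VarID 2) for the six regions.
# PopTotal is in thousands, so dividing by 1,000,000 gives billions.
region_dat <- dat %>%
  filter(VarID == 2,
         Location %in% regions) %>%
  mutate(billions = PopTotal / 1000000) %>%
  select(Location, Time, billions)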

# Stacked area chart by region

c1 <- nPlot(billions ~ Time, 
            group = "Location", 
            data = region_dat, 
            type = "stackedAreaChart")

c1$chart(color = brewer.pal(6, "Set2"))
c1$yAxis(tickFormat = "#!d3.format(',.1f')!#")
c1$yAxis(axisLabel = "Population (billions)", width = 62)
c1$xAxis(axisLabel = "Year")

c1$chart(tooltipContent = "#! function(key, x, y){
        return '<h3>' + key + '</h3>' + 
        '<p>' + y + ' billion in ' + x + '</p>'
        } !#")
        

The above code reads the data file directly into R from the UN’s website. This can be slow (a couple of minutes), so you may prefer to download the CSV to your working directory and read it in with fread from the data.table package (there is an issue with using fread to read CSVs from websites on Windows). I then subset the data into a new data frame, region_dat, using dplyr. If you are unfamiliar with dplyr, I highly recommend that you check it out; I have written about other use cases for dplyr here.
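If you expect to rerun the script several times, one option is to download the file only when a local copy does not exist yet. The helper below is a minimal sketch in base R; read_cached_csv is a hypothetical name that is not part of the original code, and it uses read.csv rather than fread so that it has no package dependencies.

```r
# Hypothetical helper (not part of the original post): download a CSV once
# and reuse the local copy on later runs.
read_cached_csv <- function(url, path = basename(url), ...) {
  if (!file.exists(path)) {
    # mode = "wb" avoids line-ending translation on Windows
    download.file(url, destfile = path, mode = "wb")
  }
  read.csv(path, stringsAsFactors = FALSE, ...)
}

# Usage (commented out because the first download is slow):
# dat <- read_cached_csv(
#   "http://esa.un.org/wpp/ASCII-Data/ASCII_FILES/WPP2012_DB02_POPULATIONS_ANNUAL.csv"
# )
```

By default the file is saved in the working directory under the same name as in the URL, so the commented-out fread call in the code above should be able to read the same file afterwards.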

The remaining code creates the NVD3 chart with rCharts. Notice that when working with rCharts, you sometimes have to pass JavaScript directly to modify the chart options; the JavaScript is enclosed in "#!...!#". I do this when formatting the y-axis values as well as the content of the tooltip. In the tooltip function, key refers to the group value, x is the x-axis value, and y is the y-axis value. You can also pass in other values to the tooltip with NVD3; I show how in this GitHub gist.
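To get a feel for what the two JavaScript snippets produce without rendering a chart, you can mimic them in plain R. This is only a rough sketch: format_tick and tooltip_html are hypothetical helper names, formatC is used as an approximate stand-in for d3.format(',.1f'), and the sketch assumes that y reaches the tooltip function already formatted by the y-axis tick format.

```r
# Rough R analog of d3.format(',.1f'): thousands separator, one decimal place.
format_tick <- function(value) {
  formatC(value, format = "f", digits = 1, big.mark = ",")
}

# R mirror of the tooltip function: key is the group, x the x-axis value,
# and y the (assumed already formatted) y-axis value.
tooltip_html <- function(key, x, y) {
  paste0("<h3>", key, "</h3>", "<p>", y, " billion in ", x, "</p>")
}

# Illustrative values only, not taken from the UN data
format_tick(1234.5678)
tooltip_html("Example region", 2050, format_tick(4.18))
```

The first call shows the comma grouping and rounding, and the second returns the HTML string that the tooltip would display for a hypothetical point.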

Typing c1 in the console then produces the chart below:

I used this chart during the first week of my World Regional Geography class to discuss population change in a global context. The region that stands out, of course, is Africa, which is projected to almost quadruple in population by the end of the century. If these projections pan out, Africa would have over 4 billion residents by 2100, and account for around 40 percent of the world’s population. The interactivity in the chart then allows for more detailed exploration of the data. By clicking on the Europe data series, for example, the user can hide all of the other regions and see how the continent’s population is projected to peak in the next several years and then begin to decline. These figures provided important context for many of the region-specific thematic topics I covered later in the class. Additionally, the chart can be switched with a single click to a “Stream” or “Expanded” (100%) view (try it!), for different views of the data.
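Figures like a region’s share of the world total or its growth multiple can be checked directly from a data frame shaped like region_dat. The sketch below uses base R with made-up placeholder numbers (not the UN figures); region_share and growth_multiple are hypothetical helper names introduced only for illustration.

```r
# Made-up placeholder numbers, NOT the UN figures; same columns as region_dat.
toy <- data.frame(
  Location = rep(c("Region A", "Region B"), each = 2),
  Time     = rep(c(2010, 2100), times = 2),
  billions = c(1, 4, 3, 6),
  stringsAsFactors = FALSE
)

# Share of the total population held by each region in a given year.
region_share <- function(df, year) {
  rows <- df[df$Time == year, ]
  setNames(rows$billions / sum(rows$billions), rows$Location)
}

# How many times larger a region is in the end year than in the start year.
growth_multiple <- function(df, region, start, end) {
  pop <- function(year) df$billions[df$Location == region & df$Time == year]
  pop(end) / pop(start)
}

region_share(toy, 2100)
growth_multiple(toy, "Region A", 2010, 2100)
```

With the real data, calls such as region_share(region_dat, 2100) and growth_multiple(region_dat, "Africa", 2010, 2100) should give the numbers behind the statements above, assuming region_dat has one row per region and year.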

I also created another visualization to show UN projections for the world’s population based on different projection scenarios. The code and resultant chart are below.

variants <- c("Low", "Medium", "High", "Constant fertility")

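# Keep the world totals for the four variants from 2012 onward.
# The factor levels set the variant order explicitly (rather than
# alphabetically), and arrange() sorts the rows by that order.
global_dat <- dat %>%
  filter(Location == "World",
         Variant %in% variants,
         Time >= 2012) %>%
  mutate(varfactor = factor(Variant,
                            levels = c("Constant fertility",
                                       "High",
                                       "Medium",
                                       "Low")),
         billions = PopTotal / 1000000) %>%
  select(varfactor, Time, billions) %>%
  arrange(varfactor)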

c2 <- nPlot(billions ~ Time, 
            group = "varfactor", 
            data = global_dat, 
            type = "lineChart")

c2$chart(color = brewer.pal(4, "Set2"))
c2$yAxis(tickFormat = "#!d3.format(',.1f')!#")
c2$yAxis(axisLabel = "Population (billions)", width = 62)
c2$xAxis(axisLabel = "Year")


c2$chart(tooltipContent = "#! function(key, x, y){
        return '<h3>' + 'Variant: ' + key + '</h3>' + 
        '<p>' + y + ' billion in ' + x + '</p>'
        } !#")
c2

The chart shows wide disparities between the UN’s projection scenarios. If global fertility were to remain at 2012 levels, the world’s population would be expected to reach nearly 28 billion by the end of the century! However, any reasonable projection scenario takes into account global declines in fertility. While the annual rate of world population growth has been declining for several decades, demographic momentum is projected to carry the world’s population to close to 11 billion by 2100, according to the most commonly cited Medium projection variant. It is notable, though, that if global fertility were to decline more than expected, the world’s population could peak and then decline over the next century. This is shown by the Low variant, which is calculated with a fertility rate half a child lower than the Medium variant. If you are interested in learning more about the UN’s projections and methodology, read their report here.
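Whether a variant peaks and then declines can also be read off the data rather than the chart. The following base R sketch uses made-up placeholder values (not UN projections) in the same column layout as global_dat; peak_by_variant is a hypothetical helper name.

```r
# Made-up placeholder values, NOT UN projections; same columns as global_dat.
toy_variants <- data.frame(
  varfactor = rep(c("Rising", "Peaking"), each = 4),
  Time      = rep(c(2020, 2040, 2060, 2080), times = 2),
  billions  = c(7, 8, 9, 10, 7, 8, 7.5, 7),
  stringsAsFactors = FALSE
)

# For each variant, report the year and size of the maximum, and whether the
# series declines afterwards (the maximum is not in the final year).
peak_by_variant <- function(df) {
  pieces <- lapply(split(df, df$varfactor, drop = TRUE), function(d) {
    d <- d[order(d$Time), ]
    i <- which.max(d$billions)
    data.frame(variant = as.character(d$varfactor[1]),
               peak_year = d$Time[i],
               peak_billions = d$billions[i],
               declines_after = i < nrow(d),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

peak_by_variant(toy_variants)
```

Calling peak_by_variant(global_dat) on the real data should then flag which of the four variants reach a maximum before 2100.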

The UN population projection dataset contains a lot more information than just this, however. Projections are included for every country, and other files are available with additional demographic information at the UN’s website. As such, there are many other ways to visualize these data; I envision a Shiny application in which the user could interactively select countries and compare projection scenarios.

You are welcome to use these visualizations if you’d like, or feel free to get the code from GitHub and build on this. If you do, I’d love to hear about what you’ve created!

Thanks to:

  • The UN Population Division for making all their data freely and publicly available;
  • The developers of rCharts and dplyr.