Visualizing international demographic indicators with idbr and Plotly

by Kyle Walker

It’s been a while since I last posted here, but I’ve been working on a new R package that I’m quite excited about, and this seemed like the right place to announce it. My new package, idbr, is an R interface to the United States Census Bureau’s International Data Base (IDB) API. The IDB includes a host of international demographic indicators, including historical data and projections to 2050. I use IDB data all the time in my teaching, and idbr makes getting the data much easier! While this product uses the Census Bureau Data API, it is not endorsed or certified by the Census Bureau.

Install from CRAN with the following command:

install.packages('idbr')

To get started, you’ll need a Census API key; if you don’t already have one, you can request it at http://api.census.gov/data/key_signup.html. Before downloading data, set your API key for your idbr session with the idb_api_key function:

library(idbr)

idb_api_key('Your API key goes here')
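
To avoid pasting the key into every script, one option is to read it from an environment variable. The following is only a sketch of that pattern and is not part of idbr: the variable name CENSUS_API_KEY and the helper get_census_key() are hypothetical names chosen for illustration, and the final idb_api_key() call is left commented out so the snippet runs without the package or a real key.

```r
# Hypothetical helper: look up the key in an environment variable.
# The name CENSUS_API_KEY is an assumption, not something idbr requires.
get_census_key <- function(var = "CENSUS_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Dummy value so the snippet runs on its own; use your real key in practice
Sys.setenv(CENSUS_API_KEY = "dummy-key-for-illustration")
key <- get_census_key()

# With idbr loaded, the key would then be passed along:
# idb_api_key(key)
```

The helper stops with an error when the variable is missing, so a forgotten key surfaces before any API request is attempted.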

There are two main functions in the idbr package. idb1 fetches population data by one-year age bands for one or more countries in one or more years, optionally restricted to an age range or to one sex. idb5 offers many more indicators, including total fertility rate, life expectancy, and population by five-year age ranges. To view all of the variables available in the idb5 function, call idb_variables(). Groups of similar variables, termed concepts, can be fetched at once; see the available concepts with idb_concepts().

Below are some examples of how to use the package with Plotly’s fantastic new R client. Browse the code for idbr at https://github.com/walkerke/idbr, and please let me know if you have any feedback!

Please note: the embedded visualizations are crashing my browser on my mobile device, so I’ve set it so they won’t show up on phones. To view the graphics, take a look at the post on your computer.

World map of infant mortality rates by country for 2016:

library(plotly)
library(viridis)

# IMR: infant mortality rate; country_name = TRUE adds a NAME column
df <- idb5(country = 'all', year = 2016, variables = 'IMR', country_name = TRUE)

plot_ly(df, z = IMR, text = NAME, locations = NAME, locationmode = 'country names',
        type = 'choropleth', colors = viridis(99), hoverinfo = 'text+z') %>%
  layout(title = 'Infant mortality rate (per 1000 live births), 2016', 
         geo = list(projection = list(type = 'robinson')))

Projected population pyramid of China in 2050:

library(dplyr)

male <- idb1('CH', 2050, sex = 'male') %>%
  mutate(POP = POP * -1,
         SEX = 'Male')

female <- idb1('CH', 2050, sex = 'female') %>%
  mutate(SEX = 'Female')

china <- rbind(male, female) %>%
  mutate(abs_pop = abs(POP))

plot_ly(china, x = POP, y = AGE, color = SEX, type = 'bar', orientation = 'h',
        hoverinfo = 'y+text+name', text = abs_pop, colors = c('red', 'gold')) %>%
  layout(bargap = 0.1, barmode = 'overlay',
         xaxis = list(tickmode = 'array',
                      tickvals = c(-10000000, -5000000, 0, 5000000, 10000000),
                      ticktext = c('10M', '5M', '0', '5M', '10M')),
         title = 'Projected population structure of China, 2050')
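
The pyramid relies on a small trick: male counts are multiplied by -1 so their bars extend to the left of zero, while the absolute value is kept for the hover text. Here is a rough base R sketch of just that data preparation. The counts are made-up placeholders rather than IDB output, and the column names AGE, POP, and SEX simply mirror the ones used in the code above.

```r
# Placeholder counts for illustration only; these are NOT real IDB figures
male   <- data.frame(AGE = 0:2, POP = c(100, 110, 120))
female <- data.frame(AGE = 0:2, POP = c(95, 105, 115))

# Negate the male counts so their bars point left of the zero line
male   <- transform(male, POP = -POP, SEX = "Male")
female <- transform(female, SEX = "Female")

# Stack both sexes and keep the unsigned count for display in tooltips
pyramid <- rbind(male, female)
pyramid$abs_pop <- abs(pyramid$POP)

print(pyramid)
```

Because the negative values would otherwise show up on the axis, the chart above replaces the tick labels with unsigned text such as '10M' and '5M'.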

Life expectancy at birth by sex, compared in a Shiny app:

First, fetch the data with idbr and save it to disk (so you don’t have to call the API each time the app starts):

# setup.R

library(idbr)

idb_api_key("Your API key here")

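# E0_F and E0_M: life expectancy at birth for females and males
full <- idb5(country = 'all', year = 2016, variables = c('E0_F', 'E0_M'), country_name = TRUE)

# save() stores the object under its name; load() in app.R restores it as `full`
save(full, file = 'idbr_data.rds')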
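
The caching step can be tried without touching the API. In this sketch a tiny placeholder data frame with invented values stands in for the idb5() result, and the file is written to a temporary directory; both are assumptions made so the snippet runs on its own. Note that save() writes R's workspace format whatever the file extension, which is why load(), not readRDS(), is the matching reader.

```r
# Placeholder standing in for the API result; values are invented
full <- data.frame(NAME = c("Country A", "Country B"),
                   E0_F = c(70, 80),
                   E0_M = c(65, 75))

# Write the object to a temporary file instead of the working directory
cache_file <- file.path(tempdir(), "idbr_data.rds")
save(full, file = cache_file)

# Remove the object, then restore it from disk under its original name
rm(full)
restored <- load(cache_file)  # load() returns the names of restored objects

print(restored)
print(full)
```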

Next, build the app:

# app.R

library(shiny)
library(countrycode)
library(plotly)
library(dplyr)
library(tidyr)

load('idbr_data.rds')

ui <- fluidPage(

  titlePanel("Life expectancy at birth by country and sex"),

  sidebarLayout(
    sidebarPanel(
      selectInput("region",
                  "Select region to plot:",
                  choices = sort(unique(countrycode_data$region)),
                  selected = 'Northern Africa')
    ),

    mainPanel(
      plotlyOutput("dumbbell")
    )
  )
)

server <- function(input, output) {

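  # Subset the life expectancy data to the countries in the selected region
  regiondf <- reactive({

    reg <- countrycode_data[countrycode_data$region == input$region, ]

    # FIPS codes for the region, used to match the FIPS column in the IDB data
    fips <- reg$fips104

    full %>%
      filter(FIPS %in% fips) %>%
      rename(Male = E0_M, Female = E0_F) %>%
      arrange(Female)  # order countries by female life expectancy

  })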

  output$dumbbell <- renderPlotly({

    regiondf() %>%
      gather(Sex, value, Male, Female) %>%
      plot_ly(x = value, y = NAME, mode = 'lines',
              group = NAME, showlegend = FALSE, line = list(color = 'gray'),
              hovermode = FALSE, hoverinfo = 'none') %>%
      add_trace(x = value, y = NAME, color = Sex, mode = 'markers',
              colors = c('darkred', 'navy'), marker = list(size = 10)) %>%
      layout(xaxis = list(title = 'Life expectancy at birth'),
             yaxis = list(title = ''),
             margin = list(l = 120))

  })

}

shinyApp(ui = ui, server = server)
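
The gather() step in the server function turns the two life expectancy columns into one row per country and sex, which is the shape the marker trace needs. A base R sketch of the same reshaping, using invented placeholder values rather than IDB data, looks roughly like this:

```r
# Wide layout: one row per country, one column per sex (values are invented)
wide <- data.frame(NAME = c("Country A", "Country B"),
                   Male = c(65, 75),
                   Female = c(70, 80))

# Long layout: stack the Male rows on top of the Female rows,
# recording which column each value came from in Sex
long <- rbind(
  data.frame(NAME = wide$NAME, Sex = "Male",   value = wide$Male),
  data.frame(NAME = wide$NAME, Sex = "Female", value = wide$Female)
)

print(long)
```

Each country now appears twice, once per sex, so a line drawn per country connects its two markers to form the dumbbell.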