Interactive data visualization

GEOG 30323

November 7, 2017

Interactive visualization

  • Thus far: you’ve learned how to create static charts with pandas and seaborn
  • Our focus now turns to interactive charts

Data storytelling

Data journalism

Source: The New York Times

Interactive visualization and the web

  • Analytical workflows have become increasingly connected to the web - or even completely web-based
  • Interactive data visualization: graphics not just on the web, but of the web

The past: Java and Flash

The present: HTML5 and JavaScript

The future: WebGL

Direct link to map

Why interactive visualization?

  • User is an active participant rather than a passive observer
  • Key points to consider:
    • What are you visualizing?
    • What is the purpose of your visualization?
    • Who is your audience?
    • In what format and venue will you be presenting the visualization?

New libraries!

  • We’re going to use some new Python libraries to chart interactively. They include:
    • ipywidgets: a package for interactive mini-apps in the Jupyter Notebook
    • plotly: produces D3.js charts using Python, and can convert your matplotlib charts as well
    • cufflinks: binds pandas to plotly
  • In CoCalc, interactivity is only available in the Classical Notebook: switch the notebook format to get this code to work

Let’s get some data!

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

mx = pd.read_csv('http://personal.tcu.edu/kylewalker/mexico.csv')

Our basic plot

mx.plot(kind = 'scatter', x = 'pri10', y = 'mus09')

Interactivity in the Notebook

  • In CoCalc: interactivity requires the Classical rather than the Modern notebook
  • Use the command %matplotlib notebook, and your plot becomes interactive!
%matplotlib notebook
mx.plot(kind = 'scatter', x = 'pri10', y = 'mus09')

Interactivity in the Notebook

  • The ipywidgets package allows you to build basic graphical user interfaces (GUIs) to explore your data in the Notebook
  • How it works: supply a function to the interact function in ipywidgets
from ipywidgets import interact

def add_five(x): 
    print(x + 5)

interact(add_five, x = (1, 100))

Interactivity with plots

cols = ['mus09', 'pri10', 'sec10', 'ter10', 'gdp08']

def make_plot(x, y, title): 
    sns.lmplot(data = mx, x = x, y = y)
    plt.title(title)    
    
interact(make_plot, x = cols, y = cols, title = "Enter a plot title!")

Widgets: not just for plots!

def get_accidents(street_name): 
    part1 = 'https://data.fortworthtexas.gov/resource/kr8h-9zxd.json?streetname='
    api_call = part1 + street_name
    return pd.read_json(api_call)

def get_street(street = ['BERRY', 'UNIVERSITY', 'ROSEDALE']): 
    df = get_accidents(street)
    return df.head()
    
interact(get_street) 
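
The street names in the dropdown are single words, so plain string concatenation is enough to build the API call. As a minimal sketch, assuming you later want to query a street whose name contains a space: the helper build_accidents_url below is hypothetical (it is not part of the original code), and 'CAMP BOWIE' is only an illustrative input. It uses urlencode from the standard library to make the name safe to send in a URL.

```python
from urllib.parse import urlencode

# Endpoint taken from get_accidents() above
BASE_URL = 'https://data.fortworthtexas.gov/resource/kr8h-9zxd.json'

def build_accidents_url(street_name):
    # Hypothetical helper: URL-encode the street name so that spaces
    # and special characters are safe inside the query string
    query = urlencode({'streetname': street_name})
    return BASE_URL + '?' + query

print(build_accidents_url('BERRY'))
print(build_accidents_url('CAMP BOWIE'))  # the space is encoded as '+'
```

The returned string can be passed to pd.read_json() in the same way as api_call.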

Plotly

  • Plotly: Web-based platform for interactive charting
  • Includes both offline components (works in the Jupyter Notebook) and an online cloud/web GUI at https://plot.ly/

Plotly in Python

  • The Plotly Python package can convert your matplotlib plots to Plotly charts!
import plotly.offline as py
py.init_notebook_mode()

mx.plot(kind = 'scatter', x = 'pri10', y = 'mus09')
fig = plt.gcf() # "Get current figure"
py.iplot_mpl(fig)

Plotly and seaborn

  • Plotly can convert some seaborn plots - though the styling will not always carry over
sns.distplot(mx.pri10)
f = plt.gcf()
py.iplot_mpl(f)

The Plotly cloud

  • Plotly plots are fully editable and saveable in the cloud if you have a Plotly account; click “Export to plot.ly” on your chart
  • Demo

Plotly.py

  • More advanced visualizations can be developed with the Plotly Python library
  • How it works: specify a list of graph objects to plot, optionally add layout options to customize the chart’s appearance, and pass both to the iplot() function to view the chart in the Jupyter Notebook
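
A rough sketch of that structure, written with plain Python dictionaries so the pieces are easy to see. The x and y values are made up for illustration and are not from the course data; Plotly also accepts graph objects such as go.Scatter in place of the trace dictionary.

```python
# Illustrative values only; these are not taken from the course data
x_vals = [1, 2, 3, 4]
y_vals = [10, 15, 13, 17]

# A trace describes one series to draw; 'type' selects the chart type
trace = dict(type = 'scatter', x = x_vals, y = y_vals,
             mode = 'markers', name = 'example series')

# Layout options are optional and control the chart's appearance
layout = dict(title = 'An example chart',
              xaxis = dict(title = 'x'),
              yaxis = dict(title = 'y'))

# A figure bundles the list of traces with the layout
fig = dict(data = [trace], layout = layout)

# In the Notebook, after py.init_notebook_mode(), display it with:
# py.iplot(fig)
```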

3-dimensional plots

import numpy as np
import plotly.graph_objs as go

arr = np.random.randn(10, 10)

data = [go.Surface(z = arr)]

py.iplot(data)
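
The z argument is a two-dimensional grid of heights, so any function evaluated over a grid can be drawn the same way. A minimal sketch, assuming an arbitrary illustrative function (sine times cosine) in place of the random values above; the trace is written as a plain dictionary, which should be equivalent to go.Surface(x = x, y = y, z = zz).

```python
import numpy as np

# Build a 50 x 50 grid of x and y coordinates
x = np.linspace(-3, 3, 50)
y = np.linspace(-3, 3, 50)
xx, yy = np.meshgrid(x, y)

# Height of the surface at every grid point (illustrative function)
zz = np.sin(xx) * np.cos(yy)

# One surface trace, written as a plain dictionary
surface = dict(type = 'surface', x = x, y = y, z = zz)
data = [surface]

# In the Notebook, display it with: py.iplot(data)
```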

Widgets and range sliders

  • Step 1: Get the data
from pandas_datareader import wb

ind = "SP.DYN.TFRT.IN"

tfr = wb.download(country = "all", indicator = ind, start = 1960, end = 2013)

tfr.reset_index(inplace = True)
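
The reset_index() call matters because the plotting function later refers to country and year as ordinary columns. A small sketch of what it does, assuming the downloaded data is indexed by country and year as the later code implies. The toy DataFrame and its values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the downloaded data; the values are made up
idx = pd.MultiIndex.from_product(
    [['Country A', 'Country B'], ['2013', '2012']],
    names = ['country', 'year'])
toy = pd.DataFrame({'SP.DYN.TFRT.IN': [2.0, 2.1, 3.0, 3.2]}, index = idx)

# Before: only the indicator is a column
print(list(toy.columns))

# After: the index levels become regular columns
toy_flat = toy.reset_index()
print(list(toy_flat.columns))
```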

Widgets and range sliders

  • Step 2: build the plotting function
def tfr_plot(country1, country2, country3):
    countries = [country1, country2, country3]    
    subset = tfr[tfr.country.isin(countries)]
    subset_wide = subset.pivot(index = "year", columns = "country", 
                               values = "SP.DYN.TFRT.IN")
    plot_data = [
        go.Scatter(x = subset_wide.index, 
                  y = subset_wide[country], 
                  name = country, 
                  mode = "lines")
        for country in countries
    ]
    plot_layout = go.Layout(title = "Total fertility rate", 
                           xaxis = dict(rangeslider = dict()), 
                           yaxis = dict(title = "TFR"))
    fig = dict(data = plot_data, layout = plot_layout)
    py.iplot(fig)
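
The pivot step inside tfr_plot reshapes the data from long format (one row per country and year) to wide format (one column per country), so that each column can be drawn as its own line. A minimal sketch on toy data; the country names and values are made up for illustration.

```python
import pandas as pd

# Toy long-format data; the values are made up for illustration
long_df = pd.DataFrame({
    'country': ['Country A', 'Country A', 'Country B', 'Country B'],
    'year': ['2012', '2013', '2012', '2013'],
    'SP.DYN.TFRT.IN': [2.1, 2.0, 3.2, 3.0]
})

# One row per year, one column per country
wide_df = long_df.pivot(index = 'year', columns = 'country',
                        values = 'SP.DYN.TFRT.IN')
print(wide_df)

# Each column can now supply the y values for one line
print(wide_df['Country A'].tolist())
```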

Widgets and range sliders

  • Step 3: interact with the plot
country_list = list(tfr.country.unique())
    
interact(tfr_plot, country1 = country_list, 
         country2 = country_list, country3 = country_list)

Saving your plots

  • If you have a Plotly account, you can save plots with the default styling to Plotly’s cloud by clicking “Export to plot.ly”
  • Otherwise: plots can be saved as standalone HTML outside of the notebook
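import cufflinks as cf # adds the .iplot() method to pandas DataFrames
cf.go_offline() # work offline, without a Plotly account

myplot = mx.iplot(kind = 'scatter', mode = 'markers', x = 'pri10', 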
         y = 'mus09', text = 'name', colors = 'green', 
         xTitle = '% of workforce in primary sector', 
         yTitle = 'Out-migration rate to the United States', 
         asFigure = True)
         
py.plot(myplot, filename = 'mexico_plot.html')