Data communication and interactive visualization

GEOG 30323

April 16, 2024

Course recap

  • Thus far: we’ve focused on exploratory data analysis, which involves data wrangling, summarization, and visualization
  • Your data analysis journey shouldn’t stop here! Topics to consider:
    • Explanatory vs. exploratory visualization
    • Statistics and data science
    • Data ethics and “big data”

Communicating with data

  • Once you’ve done all of the hard work wrangling your data, you’ll want to communicate insights to others!
  • This might include:
    • Polished data products or reports
    • Models that can scale your insights

Explanatory visualization

  • We’ve largely worked to this point with exploratory visualization, which refers to internally-facing visualizations that help us reveal insights about our data
  • Often, externally-facing data products will include explanatory visualization, which includes a polished design and emphasizes one or two key points

Infographics

Obesity infographics:

Are infographics useful?

Interactive reports

  • Example: a data journalism article - or your Jupyter Notebook!
  • Key distinction: your code, data exploration, etc. will likely be external to the report (this can vary depending on the context, however)

Data storytelling

Why interactive visualization?

  • User is an active participant rather than a passive observer
  • Key points to consider:
    • What are you visualizing?
    • What is the purpose of your visualization?
    • Who is your audience?
    • In what format and venue will you be presenting the visualization?

Interactive data journalism

Source: The New York Times

Interactivity in the Notebook

  • The ipywidgets package allows you to build basic graphical user interfaces (GUIs) to explore your data in the Notebook
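  • How it works: pass a function to interact() in ipywidgets, along with a keyword argument for each of the function's parameters; a widget is built for each argument, and the function re-runs whenever a widget value changes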
from ipywidgets import interact

def add_five(x): 
    print(x + 5)

interact(add_five, x = (1, 100))

Interactivity with plots

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

mx = pd.read_csv('http://personal.tcu.edu/kylewalker/mexico.csv')

cols = ['mus09', 'pri10', 'sec10', 'ter10', 'gdp08']

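def make_plot(x, y, title):
    # Scatterplot with a fitted regression line for the chosen columns
    sns.lmplot(data = mx, x = x, y = y)
    plt.title(title)
    plt.show()

# The lists become drop-down menus; the string becomes a text box
interact(make_plot, x = cols, y = cols, title = "Enter a plot title!")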

Widgets: not just for plots!

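def get_accidents(street_name):
    # Build the API request by appending the street name to the query string
    part1 = 'https://data.fortworthtexas.gov/resource/kr8h-9zxd.json?streetname='
    api_call = part1 + street_name
    # Read the JSON response into a DataFrame
    return pd.read_json(api_call)

# The list used as the default value becomes a drop-down menu in interact()
def get_street(street = ['BERRY', 'UNIVERSITY', 'ROSEDALE']):
    df = get_accidents(street)
    return df.head()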
    
interact(get_street) 
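
The API call above is built by gluing the street name onto the end of a URL, which works for one-word names but could break if a name contains a space. A minimal sketch of a safer approach, assuming the same endpoint and streetname parameter; build_accidents_url is a hypothetical helper, and 'CAMP BOWIE' is only an illustrative value:

```python
from urllib.parse import urlencode

# Endpoint taken from get_accidents() above
BASE_URL = 'https://data.fortworthtexas.gov/resource/kr8h-9zxd.json'

def build_accidents_url(street_name):
    """Hypothetical helper: return the request URL for one street name."""
    # urlencode escapes spaces and other special characters in the value
    query = urlencode({'streetname': street_name})
    return BASE_URL + '?' + query

print(build_accidents_url('BERRY'))
print(build_accidents_url('CAMP BOWIE'))  # illustrative two-word name
```

The resulting string could be passed to pd.read_json() in place of api_call.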

Ipywidgets fundamentals

The value you pass for each keyword argument in interact() determines which widget you get:

  • A boolean (True or False) creates a checkbox
  • A string creates a text input
  • A number (integer or float) creates a slider
  • A tuple of numbers (start, end, and optionally step size) creates a slider limited to that range
  • A list creates a drop-down menu
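
The rules in this list can be written out as a small plain-Python function. This is only a sketch of the mapping, not how ipywidgets is actually implemented, and widget_for is a hypothetical name chosen for illustration:

```python
def widget_for(value):
    """Hypothetical helper: name the widget the rules above would give."""
    # Check bool first: in Python, True and False also count as integers
    if isinstance(value, bool):
        return 'checkbox'
    if isinstance(value, str):
        return 'text input'
    if isinstance(value, (int, float)):
        return 'slider'
    if isinstance(value, tuple):
        return 'slider with a set range'
    if isinstance(value, list):
        return 'drop-down menu'
    return 'not covered by these rules'

# Values borrowed from the earlier interact() examples
for example in [True, 'Enter a plot title!', 10, (1, 100), ['mus09', 'pri10']]:
    print(repr(example), '->', widget_for(example))
```

Note the order of the checks: because Python treats True and False as integers, testing for a number first would label a boolean as a slider.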

Experiment with the functions you’ve written this semester!

Interactive visualization and the web

  • Analytical workflows have become increasingly connected to the web - or even completely web-based
  • Interactive data visualization: graphics not just on the web, but of the web

Filtering & highlighting

Image source: The New York Times

Linked charts

Link to live example

Interactive visualization in Python

Altair example

import altair as alt

alt.Chart(mx).mark_circle(size = 60).encode(
    x = "pri10",
    y = "mus09",
    tooltip = ["name", "mus09", "pri10"]
).interactive()

Tableau

  • Highly popular software for data visualization - both exploratory and explanatory
  • Intuitive, drag-and-drop interface
  • Key feature: the dashboard

Data dashboards

Demo: Tableau Public