Building a Scatterplot with Pandas and Seaborn

Pandas and Seaborn go together like lemon and lime. In the code below, Pandas constructs a dataframe from a CSV file, and Seaborn (which sits on top of matplotlib and makes it look a million times better) handles the visualisation end of things.

The dataframe consists of three columns, passiveYear, activeYear and Vala, where:

activeYear = the year of a case that considered an earlier case

passiveYear = the year of a case which has itself been considered

Vala = the type of consideration the active case meted out against the passive case.
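
To make the column layout concrete, here is a minimal sketch that reads a tiny in-memory CSV the same way the main code reads hoad.csv. The rows and the Vala labels ("Applied", "Followed", "Distinguished") are invented purely for illustration and are not taken from the real data; the point is only to show what dropna() removes and how many facets the grid would end up with.

```python
import io

import pandas as pd

# Hypothetical sample rows, invented for illustration only.
SAMPLE_CSV = """activeYear,passiveYear,Vala
1998,1932,Applied
2004,1975,Followed
2010,,Distinguished
2012,1990,Applied
"""

# Same read pattern as the post: keep the three columns, drop incomplete rows.
sample = pd.read_csv(
    io.StringIO(SAMPLE_CSV),
    usecols=["Vala", "passiveYear", "activeYear"],
).dropna()

# One chart is drawn per distinct Vala value, so this shows the facets to expect.
facet_counts = sample["Vala"].value_counts()

print(sample)
print(facet_counts)
```

The row with the missing passiveYear is dropped, so in this made-up sample its Vala type never reaches the grid at all. If a facet you expected is missing from the real plot, null values in the year columns are worth checking first.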


import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Parse CSV, bring in the columns and drop null values
df = pd.read_csv('hoad.csv', usecols=['Vala', 'passiveYear', 'activeYear']).dropna()

# Build a grid consisting of a chart for each Vala type

# (seaborn 0.9 renamed the old size parameter to height)
grid = sns.FacetGrid(df, col="Vala", hue="Vala", col_wrap=3, height=3)

# Draw a horizontal line to show the rough midway point along the y axis
grid.map(plt.axhline, y=1907, ls=":", c=".5")

# Plot where x=activeYear and y=passiveYear
grid.map(plt.scatter, "activeYear", "passiveYear", marker="o", alpha=0.5)

# Adjust the tick positions and labels
grid.set(xticks=[1800,2015], yticks=[1800,2015],
         xlim=(1955, 2015), ylim=(1800, 2015))

# Adjust the arrangement of the plots
grid.fig.tight_layout(w_pad=1)

This code yields the following visualisation: