Data Visualisation

Authenticating into neo4j in a Jupyter Notebook using py2neo

I recently spent a frustrating few hours trying to replicate these examples in a Jupyter Notebook.

Every time I attempted to run this line...

graph = Graph()

... the trace would go loopy and I'd get a connection refused error. Here's how to get around this problem. 

Start Neo4j

You need a running instance of Neo4j before you run any of the code. Either launch the desktop app or, if you prefer, start an instance from the shell, like so:

$ neo4j start

By default, Neo4j serves its HTTP interface on port 7474.
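
If you want to confirm that something is actually answering on that port before calling Graph(), a quick check along these lines can save some head-scratching. This is only a rough sketch using the standard library: the helper name neo4j_is_listening is my own invention, and an open port only tells you that some process is listening there, not that it is definitely Neo4j.

```python
import socket


def neo4j_is_listening(host="localhost", port=7474, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        # create_connection raises OSError (e.g. connection refused) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if neo4j_is_listening():
    print("Something is listening on port 7474, so it is worth trying to connect.")
else:
    print("Nothing is listening on port 7474; start Neo4j first.")
```

If the second message appears, the connection refused error is coming from Neo4j not running rather than from authentication.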

The authentication code

The following code can then be run inside the Notebook (or wherever) and you won't get the error I kept seeing:

from py2neo import authenticate, Graph

# set up authentication parameters
authenticate("localhost:7474", "neo4j", "Nov2015!!")

# connect to authenticated graph database
graph = Graph("http://localhost:7474/db/data/")
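
Typing the password straight into a notebook makes it easy to leak if you ever share the file. One option, sketched below, is to read the credentials from environment variables instead. The variable names NEO4J_USER and NEO4J_PASSWORD and the helper neo4j_credentials are assumptions of mine for illustration; they are not part of py2neo or Neo4j.

```python
import os


def neo4j_credentials(env=None):
    """Return (user, password) from environment-style variables.

    NEO4J_USER and NEO4J_PASSWORD are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    user = env.get("NEO4J_USER", "neo4j")  # fall back to the user name used above
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError("NEO4J_PASSWORD is not set")
    return user, password


# Demonstration with a stand-in mapping rather than the real environment
user, password = neo4j_credentials({"NEO4J_PASSWORD": "example-only"})

# With py2neo installed and Neo4j running, the values slot into the calls above:
#     authenticate("localhost:7474", user, password)
#     graph = Graph("http://localhost:7474/db/data/")
```

As far as I'm aware, later py2neo releases (version 4 onwards) dropped authenticate in favour of passing auth=(user, password) directly to Graph, so it is worth checking which version you have installed.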

Building a Scatterplot with Pandas and Seaborn

Pandas and Seaborn go together like lemon and lime. In the code below, we're using Pandas to construct a dataframe from a CSV file and Seaborn (which sits on top of matplotlib and makes it look a million times better) is handling the visualisation end of things.

The dataframe consists of three columns, passiveYear, activeYear and Vala, where:

activeYear = the year of a case that considered an earlier case

passiveYear = the year of a case which has itself been considered

Vala = the type of consideration the active case meted out against the passive case.
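
To make that layout concrete, here is a small sketch that builds the same kind of dataframe from a few in-memory rows. The rows and the Vala labels are made up purely for illustration and are not taken from hoad.csv; the point is just to show what the read_csv and dropna steps do to a row with a missing year.

```python
import io

import pandas as pd

# Invented sample rows, NOT taken from hoad.csv; the Vala labels are placeholders
SAMPLE_CSV = """activeYear,passiveYear,Vala
2001,1987,Applied
2010,1932,Distinguished
1999,,Followed
"""

# Keep the three columns, drop the row with the missing passiveYear, and
# cast passiveYear back to int (the gap makes pandas read it as float)
sample = (
    pd.read_csv(
        io.StringIO(SAMPLE_CSV),
        usecols=["Vala", "passiveYear", "activeYear"],
    )
    .dropna()
    .astype({"passiveYear": int})
)

print(sample)
```

Only the two complete rows survive, which is why the charts never have to cope with a case that has no year attached.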


import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Parse CSV, bring in the columns and drop null values
df = pd.read_csv('hoad.csv', usecols=['Vala', 'passiveYear', 'activeYear']).dropna()

# Build a grid consisting of a chart for each Vala type
# (seaborn 0.9+ names the panel size `height`; older releases called it `size`)
grid = sns.FacetGrid(df, col="Vala", hue="Vala", col_wrap=3, height=3)

# Draw a horizontal line to show the rough midway point along the y axis
grid.map(plt.axhline, y=1907, ls=":", c=".5")

# Plot where x=active year and y=passive year
grid.map(plt.scatter, "activeYear", "passiveYear", marker="o", alpha=0.5)

# Adjust the tick positions and labels
grid.set(xticks=[1800,2015], yticks=[1800,2015],
         xlim=(1955, 2015), ylim=(1800, 2015))

# Adjust the arrangement of the plots
grid.fig.tight_layout()

This code yields the following visualisation: