Connected Dot Plot Using Python Plotly

One of the regular data science exercises I participate in is Cole Nussbaumer Knaflic’s #SWDChallenge, where she posts a challenge at the beginning of each month. Participants are encouraged to produce the specific type of plot she has chosen for that month, using whatever software or language they prefer. I have been joining faithfully since April, and I have learned a lot so far!

The challenge for August was a dot plot, so, as the rules allow, I picked a dataset of my own and made a dot plot from it. I had never made a dot plot before, but thankfully Python’s Plotly has a pretty good tutorial for making one. Pretty soon I was able to create version 1 of my dot plot.

 

I thought it was not a bad first attempt, and so I tweeted my plot to Cole hoping for feedback. As usual she did not disappoint!

 

Arranging my plot into the correct alphabetical order wasn’t too complicated, but the second suggestion of converting it into a connected dot plot stumped me. As far as I could tell, there wasn’t any option to make connected dot plots in Plotly just yet. I would have to improvise, which is the point of this whole post, if you can pardon my lengthy introduction.

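I discovered that by combining Plotly’s dot plot (a scatter trace drawn in markers mode) with one-sided horizontal error bars, I could create an acceptable approximation of a connected dot plot. The error bar on each second dot reaches back to the first dot and acts as the connecting line.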

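import pandas as pd
import plotly.graph_objs as go
import plotly.offline as pyoff

# plotly.plotly moved to the separate chart_studio package in Plotly 4
# and is not needed for offline plotting, so it is not imported here.
# Enable inline rendering in a Jupyter notebook.
pyoff.init_notebook_mode()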

# Setting up the toy data for demonstration
# It is important to create the differences here to specify
# the length of the line connecting the dots later.
toydata = pd.DataFrame()
toydata['Category'] = ['Good', 'Better', 'Best', 'Awesome']
toydata['First Values'] = [3, 6, 2, 4]
toydata['Second Values'] = [7, 8, 5, 5]
diffs = toydata['Second Values'] - toydata['First Values']

# The actual code for creating the connected dot plot
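# trace1 plots the second values. Its error bar extends only to the
# left (arrayminus) by the difference, so it ends at the first value.
# This assumes every second value is >= the matching first value.
trace1 = go.Scatter(
    x=toydata['Second Values'],
    y=toydata['Category'],
    error_x=dict(
        type='data',
        symmetric=False,
        array=[0] * len(diffs),
        arrayminus=diffs,
        color='gray',
        thickness=4
    ),
    mode='markers',
    name='Second Values',
    marker=dict(
        color='blue',
        line=dict(
            color='rgba(217, 217, 217, 1.0)',
            width=1,
        ),
        symbol='circle',
        size=12,
    )
)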
trace2 = go.Scatter(
    x=toydata['First Values'],
    y=toydata['Category'],
    mode='markers',
    name='First Values',
    marker=dict(
        color='red',
        line=dict(
            color='rgba(156, 165, 196, 1.0)',
            width=1,
        ),
        symbol='circle',
        size=12,
    )
)

data = [trace1, trace2]
layout = go.Layout(
    title="Change in Values for Categories",
    width=600,
    height=400,
    paper_bgcolor='rgb(254, 247, 234)',
    plot_bgcolor='rgb(254, 247, 234)'
)

fig = go.Figure(data=data, layout=layout)
pyoff.iplot(fig, filename='blog-toydata')
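
The error-bar trick relies on every second value being at least as large as the first, so that `diffs` is never negative. For data where values can move in either direction, here is a small sketch of one way to cope. The helper name `connector_error_bars` is made up for illustration and is not a Plotly function; it only builds a dictionary with the same `error_x` keys used in the post, so that the bar always reaches from the second dot toward the first.

```python
def connector_error_bars(first, second, color='gray', thickness=4):
    """Build an error_x dict for a trace plotted at the `second` values.

    Hypothetical helper (not part of Plotly): the bar extends from each
    second value toward the matching first value, whichever side it is on.
    """
    if len(first) != len(second):
        raise ValueError('first and second must have the same length')
    # Extend to the right when the first value is larger than the second
    plus = [max(f - s, 0) for f, s in zip(first, second)]
    # Extend to the left when the first value is smaller than the second
    minus = [max(s - f, 0) for f, s in zip(first, second)]
    return dict(
        type='data',
        symmetric=False,
        array=plus,
        arrayminus=minus,
        color=color,
        thickness=thickness,
    )


# Toy values from the post, with the last pair flipped to show a decrease
bars = connector_error_bars([3, 6, 2, 5], [7, 8, 5, 4])
print(bars['array'])       # [0, 0, 0, 1]
print(bars['arrayminus'])  # [4, 2, 3, 0]
```

The returned dictionary can be passed as `error_x=bars` to the `go.Scatter` call that plots the second values.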

 

The `pyoff.iplot` call above renders the following plot:

 

Applying the same technique to my challenge dataset, I was able to turn my first version into a better plot.
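
The other half of the feedback, alphabetical ordering, can be handled before the traces are built. Here is a minimal sketch, assuming three parallel lists like the toy columns. Plotly fills a categorical y-axis from the bottom up in the order the categories appear in the data, so putting the data in reverse alphabetical order makes the chart read A to Z from the top down. The function name `sort_for_dot_plot` is made up for this example.

```python
def sort_for_dot_plot(categories, first, second):
    """Return the three lists reordered so categories run Z to A.

    Hypothetical helper: a categorical y-axis is filled from the bottom
    up, so reverse alphabetical data order reads A to Z from the top.
    """
    rows = sorted(zip(categories, first, second),
                  key=lambda row: row[0].lower(),
                  reverse=True)
    # Unzip the sorted rows back into three parallel lists
    cats, firsts, seconds = (list(col) for col in zip(*rows))
    return cats, firsts, seconds


# The toy categories in a shuffled order
cats, firsts, seconds = sort_for_dot_plot(
    ['Best', 'Good', 'Awesome', 'Better'],
    [2, 3, 4, 6],
    [5, 7, 5, 8],
)
print(cats)     # ['Good', 'Better', 'Best', 'Awesome']
print(firsts)   # [3, 6, 2, 4]
print(seconds)  # [7, 8, 5, 5]
```

With a DataFrame like `toydata`, `toydata.sort_values('Category', ascending=False)` does much the same job before the traces are created.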

 

You can see my #SWDChallenge submission in the recap post.

Hope this helps when you make your own connected dot plot!

Becomingsleek Relaunch and Rebrand

It’s been about half a year since my last post, and the biggest reason for that is starting my Master’s degree. I am currently taking the Master of Science in Data Science at the Asian Institute of Management, which is a full-time program. It’s been quite a big change and the coursework is heavy, but about two months in I can say it’s definitely been worth it so far, and it’s fun!

I wanted to record the things I am learning here, and I figured the best way to do that is by blogging about it. This is the relaunch and third rebranding of Becomingsleek: from this point on I will be posting mostly about Data Science, Python, Visualization, and the like. I will be slowly changing the look of the site as well.