One of the regular data science exercises I participate in is Cole Nussbaumer Knaflic’s #SWDChallenge, where she posts a challenge at the beginning of each month. Participants are encouraged to produce the specific type of plot she has chosen for the month, using any software or language they prefer. I have been joining faithfully since April, and I have learned a lot so far!
The challenge for August was a dot plot, and since participants are free to pick their own data, I chose a dataset and made a dot plot from it. I had never made a dot plot before, and thankfully Python’s Plotly has a pretty good tutorial for making one. Pretty soon I was able to create version 1 of my dot plot.
I thought it was not a bad first attempt, and so I tweeted my plot to Cole hoping for feedback. As usual she did not disappoint!
Interesting data, Patricia. You may try ordering ascending or descending (I think it's reverse alpha now?). Also consider connecting points via a line or shading—will make year to year variance easier to see, I think. Thanks for creating & sharing! #SWDchallenge https://t.co/zGtzW4N2Tb
— Cole Knaflic (@storywithdata) August 6, 2018
Arranging my plot into the correct alphabetical order wasn’t too complicated, but the second suggestion of converting it into a connected dot plot stumped me. As far as I could tell, there wasn’t any option to make connected dot plots in Plotly just yet. I would have to improvise, which is the point of this whole post, if you can pardon my lengthy introduction.
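I discovered that by combining Plotly’s dot plot with horizontal error bars, I could create an acceptable approximation of a connected dot plot. The trick is to attach one-sided error bars to the second set of dots and make each bar exactly as long as the difference between the two values, so that it reaches back to the first dot.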
    import pandas as pd
    import plotly.graph_objs as go
    import plotly.offline as pyoff

    pyoff.init_notebook_mode()

    # Setting up the toy data for demonstration.
    # It is important to create the differences here to specify
    # the length of the line connecting the dots later.
    toydata = pd.DataFrame()
    toydata['Category'] = ['Good', 'Better', 'Best', 'Awesome']
    toydata['First Values'] = [3, 6, 2, 4]
    toydata['Second Values'] = [7, 8, 5, 5]
    diffs = toydata['Second Values'] - toydata['First Values']

    # The actual code for creating the connected dot plot.
    # The error bars on the second values act as the connecting lines.
    trace1 = go.Scatter(
        x=toydata['Second Values'],
        y=toydata['Category'],
        error_x=dict(
            type='data',
            symmetric=False,
            array=[0] * len(diffs),  # no bar to the right of the dot
            arrayminus=diffs,        # bar reaches left, back to the first value
            color='gray',
            thickness=4
        ),
        mode='markers',
        name='Second Values',
        marker=dict(
            color='blue',
            line=dict(
                color='rgba(217, 217, 217, 1.0)',
                width=1,
            ),
            symbol='circle',
            size=12,
        )
    )

    # The first values are plain dots with no error bars.
    trace2 = go.Scatter(
        x=toydata['First Values'],
        y=toydata['Category'],
        mode='markers',
        name='First Values',
        marker=dict(
            color='red',
            line=dict(
                color='rgba(156, 165, 196, 1.0)',
                width=1,
            ),
            symbol='circle',
            size=12,
        )
    )

    data = [trace1, trace2]
    layout = go.Layout(
        title="Change in Values for Categories",
        width=600,
        height=400,
        paper_bgcolor='rgb(254, 247, 234)',
        plot_bgcolor='rgb(254, 247, 234)'
    )

    fig = go.Figure(data=data, layout=layout)
    pyoff.iplot(fig, filename='blog-toydata')
The code above will result in the following plot:
Using the code above, I was able to turn my plot into a better one.
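One thing the toy data does not exercise is a category whose second value is lower than its first. Here is a minimal sketch of how the error bar lengths could be computed for that case, assuming the bar should simply extend to the right of the dot instead of the left. The helper name `connector_error_bars` and the extra fifth data point are hypothetical, made up for illustration; the function uses plain Python and only builds a dictionary with the same keys as the `error_x` dictionary used earlier in this post.

```python
def connector_error_bars(first_values, second_values, color="gray", thickness=4):
    """Build an error_x-style dict for a trace plotted at second_values.

    Each bar reaches from the second value back to the first value,
    whichever side of the dot the first value falls on.
    (Hypothetical helper, not part of Plotly.)
    """
    if len(first_values) != len(second_values):
        raise ValueError("first_values and second_values must have the same length")

    diffs = [second - first for first, second in zip(first_values, second_values)]
    return dict(
        type="data",
        symmetric=False,
        # Bar to the right of the dot: needed when the first value is larger.
        array=[max(-d, 0) for d in diffs],
        # Bar to the left of the dot: needed when the first value is smaller.
        arrayminus=[max(d, 0) for d in diffs],
        color=color,
        thickness=thickness,
    )


# Toy data from the post, plus one made-up category that decreases (9 -> 6).
first = [3, 6, 2, 4, 9]
second = [7, 8, 5, 5, 6]

error_x = connector_error_bars(first, second)
print(error_x["array"])
print(error_x["arrayminus"])
```

The resulting dictionary is meant to be passed as `error_x=error_x` to the `go.Scatter` trace for the second values. Both lists stay non-negative, so each bar has zero length on one side and the full difference on the other.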
You can see my #SWDChallenge submission in the recap post.
Hope this helps when you make your own connected dot plot!