Last week I talked about when chart junk is OK for communicating information, depending on the context. Today I want to continue this data visualization mini-series by looking at how chart junk can be removed to better follow Edward Tufte’s data-ink ratio principle, discussed a few weeks ago.
The first example of transforming chart junk is a chart of NFL football injuries from 2013 featured in the New York Times. I personally think this chart junk is a nice blend of art and data, but let’s apply Tufte’s design principles nonetheless to make a plain version of the data visualization.
The after chart is a regular bar chart, made in Excel with the data-ink maximization principle in mind. As in the example from the link, the labels could have been placed over the bars to eliminate the Y axis, but that is harder in Excel than in Python.
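For anyone curious what the Python route might look like, here is a minimal sketch using matplotlib. It is not the code behind my chart (that one was made in Excel); the function names `format_count` and `plot_labeled_bars`, the output file name, and the gray color are all my own assumptions for illustration. The idea is simply to write each value above its bar and then hide the Y axis, since the direct labels make it redundant.

```python
def format_count(value):
    """Format a bar's value as a label, with thousands separators."""
    return f"{value:,}"


def plot_labeled_bars(categories, counts, out_path="bars.png"):
    """Draw a plain bar chart with direct labels and no Y axis (hypothetical helper)."""
    # Imported here so format_count() stays usable without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(6, 3))
    bars = ax.bar(categories, counts, color="0.4")  # one neutral gray

    # Write each value directly above its bar.
    for bar, count in zip(bars, counts):
        ax.annotate(
            format_count(count),
            xy=(bar.get_x() + bar.get_width() / 2, bar.get_height()),
            xytext=(0, 3),
            textcoords="offset points",
            ha="center",
            va="bottom",
        )

    # With direct labels, the Y axis and most of the frame add no information.
    ax.yaxis.set_visible(False)
    for side in ("top", "right", "left"):
        ax.spines[side].set_visible(False)
    ax.tick_params(axis="x", length=0)

    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)
    return out_path
```

Calling `plot_labeled_bars(["Category A", "Category B", "Category C"], [120, 85, 40])` would write `bars.png`. Those category names and numbers are placeholders, not the figures from the New York Times chart.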
It is important to realize a few points: some information was dropped from the junk chart, such as the subcategories for certain injury categories (Head, Hand, Upper Leg) and the total number of injuries. Another interesting point to consider is that the image of the body part affected by an injury communicates a message in the junk chart that is missing from the plain bar chart.
The next example, “What are the 10 Most Spoken Languages in the World?”, was found on Stats Chat.
The transformed chart follows Tufte’s principles by: 1) removing the background and its embellishments of men, 2) removing redundant labels (‘million speakers’), 3) removing unnecessary borders, 4) reducing colors, 5) removing the special effects of word bubbles, 6) removing bolding and other uses of font to communicate information, and 7) following ‘Less is more effective.’ Exchanging the x- and y-axes and editing the title also help simplify the message.
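Several of these steps can be sketched in Python with matplotlib. Again, this is only an illustration under my own assumptions, not how the transformed chart was actually produced: `strip_unit` and `plot_plain_barh` are hypothetical helper names, and any values passed in would need to come from the original chart. The sketch drops the repeated unit from the labels, uses a single neutral color, removes the borders, and swaps the axes by drawing horizontal bars.

```python
def strip_unit(label, unit="million speakers"):
    """Drop a unit repeated on every label; it can be stated once in the title."""
    return label.replace(unit, "").strip()


def plot_plain_barh(names, values, title, out_path="plain.png"):
    """Draw a stripped-down horizontal bar chart (hypothetical helper)."""
    # Imported here so strip_unit() stays usable without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(6, 3))

    # Horizontal bars put the category names on the vertical axis,
    # where they read naturally from left to right.
    bars = ax.barh(names, values, color="0.4")  # one color for every bar
    ax.invert_yaxis()  # keep the first item at the top

    # Label each bar with its bare value, without the repeated unit.
    for bar, value in zip(bars, values):
        ax.annotate(
            str(value),
            xy=(bar.get_width(), bar.get_y() + bar.get_height() / 2),
            xytext=(3, 0),
            textcoords="offset points",
            ha="left",
            va="center",
        )

    # The unit appears once, in a plain (non-bold) title.
    ax.set_title(title, loc="left")

    # No borders and no value axis: the direct labels already carry the numbers.
    ax.xaxis.set_visible(False)
    for spine in ax.spines.values():
        spine.set_visible(False)
    ax.tick_params(axis="y", length=0)

    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)
    return out_path
```

For example, `strip_unit("120 million speakers")` gives `"120"` (a made-up label, not a number from the Stats Chat chart), and the unit can then go into the title passed to `plot_plain_barh`.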
As always, I’d love to hear your thoughts on the role of chart junk in data analysis and data science. What do you think of my example Tufte transformations?
Next week, I’ll talk about some of the D3 coding I’ve been doing this semester.