Thursday, 12 September 2013

Data Visualisation Rules

Data-Driven Journalism recently posted a survival guide for data visualisation. I found it to be almost* comprehensive and decided to put it to the test on the following image.

The Monkey Cage, which has recently moved to the Washington Post, asked whether it was the worst chart on record. I think there are a few points worth defending it on, and some points that aren't even mentioned. So here is the survival guide on charts, with my comments added after the relevant rules.

  • Avoid 3D-charts at all costs. The perspective distorts the data, what is displayed 'in front' is perceived as more important than what is shown in the background. I've written about this before, and 3D brings other problems too. It shows the data from an angle, which means that even if the figures were all the same they would not be perceived as being the same. As we go right, the faces of the bars appear to get wider. As we go left, we see more of the depth, because the tops and right-hand sides of the bars become visible.
  • Use pie charts with care, and only to show part of whole relationships. Two is the ideal number of slices, but never show more than five. Don't use pie charts if you want to compare values (use bar charts instead).
  • Always extend bar charts to the zero baseline. Order bars by value to make comparison easier. Now this is an interesting one. "Did not extend the bar to zero" was my first thought on viewing the chart but ... we are dealing with temperature. What is a fair zero? Both centigrade and Fahrenheit are options but would give different charts. The fairest of all would be Kelvin, as its zero is absolute zero rather than an arbitrary reference point. This drives home to me the need for care when drawing charts with temperature: the global-warming debate is heated enough.
  • Use line charts to show time series data. That's simply the best way to show how a variable changes over time. This is spot on, and I would add a corollary: if you use years and don't use a line chart, then it is not a time series. That was another criticism of the chart: that people would expect the years to be in order. I think if there are a limited number of years (6 is probably pushing the limits) then people will not be confused.
  • Avoid stacked area charts, they are easily misinterpreted.
  • Prefer direct labeling wherever possible. You can save your readers a lot of time by placing labels directly onto the visual elements instead of collecting them in a separate legend. Also remember that we cannot differentiate that many colors. I think the placement of labels and the choice of colours are a real plus for the chart.
  • Label your axes! You might think that's kind of obvious, but still it happens quite often that designers and journalists simply forget to label the axes. This cannot be defended.
  • Tell readers why they should care about your graphic. Don't waste the title line to simply say what data is shown. Should readers care about this chart? Maybe it is just a fun one to show how hot it is whilst hoping to avoid the flame wars of global warming.
  • My conclusion is that this is a hard chart to be unbiased about, because a zero temperature has to be chosen. It is probably something you don't even want to turn into a chart. The choice of 3D was a poor one but all the other choices led to a pretty-looking graph. Hardly the worst I've seen.
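
To make the point about the zero baseline concrete, here is a rough sketch in Python. The two temperatures are made up for illustration (they are not taken from the chart), and the helper names are my own. It compares how tall the "hot" bar would be relative to the "cool" bar when both are drawn from zero in each unit.

```python
def to_fahrenheit(celsius: float) -> float:
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32


def to_kelvin(celsius: float) -> float:
    """Convert a Celsius reading to Kelvin."""
    return celsius + 273.15


# Hypothetical readings, chosen only to illustrate the effect.
cool_c, hot_c = 20.0, 30.0

# Ratio of bar heights when each bar starts at that unit's zero.
ratios = {
    "Celsius": hot_c / cool_c,
    "Fahrenheit": to_fahrenheit(hot_c) / to_fahrenheit(cool_c),
    "Kelvin": to_kelvin(hot_c) / to_kelvin(cool_c),
}

for unit, ratio in ratios.items():
    print(f"{unit:>10}: hot bar is {ratio:.2f}x the height of the cool bar")
```

The same pair of readings looks like a 50% jump in Celsius, roughly a 26% jump in Fahrenheit and only about a 3% jump in Kelvin, which is why the choice of zero matters so much for a temperature bar chart.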
