For week 16 we gave the #MakeoverMonday community A LOT of data to work with. 784 million records to be exact.
Why? Well, these were my reasons:
- It’s good to practice with datasets of varying sizes
- It gives people practice with other types of data sources. In this case they got to work with a live connection to EXASOL
- We used an interesting dataset full of information which people could either analyse at the summary level or drill into if they wanted to focus on a specific story
- The topic was very timely, with research based on the data being published recently. Also, Jeremy Singer-Vine from Data Is Plural picked up on the data
While I have noticed fewer submissions this week compared to some previous weeks, I have been impressed by their quality across the board. Many authors are really hitting their stride, not only getting into a weekly rhythm but also delivering consistently good quality work. That really excites me and I’m inspired by many of you, especially when it comes to design and finding those little golden nuggets in the data!
What else stood out this week?
- People’s enthusiasm. It’s encouraging to see and know that people are keen to participate and get their hands on the data as soon as it is released. Many also seemed excited by the opportunity to play with a large dataset. It’s something different and we enjoy giving everyone a chance to try something they don’t often get to do in their day-to-day work
- Finding YOUR story. Many of you focused on a specific topic or subset of the data to identify a story rather than trying to tackle the whole dataset all at once. I think that’s a great approach and judging by the vizzes this week, many found a topic that really interested them. It’s great that you then didn’t just present the data, but also added context and made your vizzes informative, so others could learn. Some really good examples include these dashboards by Natalia Dominguez and Daniel Caroli.
This week, like every other week, we’ve identified a few lessons where some authors went a bit wrong or seemed to have some challenges, so here are some recommendations on what to address and how.
LESSON 1: PROVIDE CONTEXT
With a topic like medical prescriptions, most people won’t actually have a lot of background or any expertise, so it is important to provide some level of context in our data visualisations.
Some people just publish a chart with a heading but no further information. If the heading is a question but the viz doesn’t contain an answer, that’s not very helpful. If the chart is minimalist but in no way explained, then for those not viewing the interactive version and just going by the screenshot, it is nearly impossible to gain any insights or derive meaning from the dataset. And if the dashboard focuses on a specific subset of data but no information is provided for the particular topic, it can be difficult to engage with the audience effectively.
- If you use a question in your title (and those are a great way to engage your viewers), then make sure the answer is either stated on your viz or very obvious to see from the data.
- Keep in mind that your audience very likely comes into contact with the topic for the first time when they look at your viz. Most people around you probably never gave much thought to medical prescriptions in England between 2010 and 2017. What can you do to make it very easy for them to understand the data?
- Text elements can help explain what the data is about and what your findings are.
- Big numbers can draw attention to something you want to highlight. Just make sure you explain what those numbers represent.
- Consider using hover actions and something like an ‘info’ icon as a way of providing more information and links to external sources. This way you help your audience without filling up your viz unnecessarily.
- Ask them! Show your viz to your partner, your kids, friends or colleagues and ask them whether they understand it.
LESSON 2: USE CHARTS THAT ARE EASY TO READ AND UNDERSTAND
Area charts with too many slices and line charts with too many lines can be very confusing. Do you need all those lines or could you simply use a couple of them instead? The same goes for bar charts with 10 different bars in various colours per pane. Can your audience actually gain insights and make sense of the data when it is represented this way?
We often want to show A LOT of data at the same time and let our audience choose what they want to focus on, but this is not always the most useful approach. For us as authors it is okay to determine the focus of our dashboard and ‘remove’ everything else.
One example I liked this week was the viz by Shawn Levin. In his first version he had an area chart with about 10 different dimension members stacked on top of each other. My suggestion to him was to pick the top ones and group the remaining medications as ‘other’. He did that and I much prefer his second version, because not only does it look cleaner, but it also shows me more clearly how the top 5 medications compare to the rest of them combined.
- You don’t need to cover every aspect of the data in your viz. Feel free to narrow your focus and run with that.
- Make it specific and relevant and simplify your views where you can, so that your message stands out clearly. Nothing stops you from creating a second dashboard at a later stage if you want to analyse something else from your dataset.
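If you prepare your data outside of your viz tool, the ‘top few plus other’ grouping I suggested to Shawn can be sketched in a few lines of Python. This is only a rough illustration: the function name `top_n_with_other`, the ‘Other’ label and the drug names and totals are all made up for the example and do not come from the prescriptions dataset.

```python
def top_n_with_other(totals, n=5, other_label="Other"):
    """Keep the n largest categories and sum everything else into one bucket."""
    # Rank categories from largest to smallest total
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    result = dict(ranked[:n])
    # Everything outside the top n is combined into a single value
    remainder = sum(value for _, value in ranked[n:])
    if remainder:
        result[other_label] = remainder
    return result


# Hypothetical totals, for illustration only
sample = {
    "Drug A": 50,
    "Drug B": 40,
    "Drug C": 30,
    "Drug D": 5,
    "Drug E": 3,
    "Drug F": 2,
}

grouped = top_n_with_other(sample, n=3)
for name, total in grouped.items():
    print(name, total)
```

With fewer slices the chart is easier to read, and the combined bucket still lets your audience compare the top medications to the rest.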
LESSON 3: CHECK THE RANGE OF YOUR DATA
Whoops, this dataset contained data from the years 2010 – 2017. But there’s a catch: 2010 data started in August that year and 2017 data only includes January. This means we CANNOT compare the different years at the year level, i.e. we cannot say that prescriptions doubled in 2011 compared to 2010, because 2010 only contains 5 months of data (August to December), so of course it’s less than 2011. We can only compare data at the same level of detail when it comes to the dates. So if the data contains COMPLETE quarters, then it is appropriate to compare at the quarterly level. Otherwise drop it down to the month level.
- Always check what the lowest level of detail in your data is. In this dataset we had complete months (as per the website where the data was sourced from: “General practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred and there is no record for a zero total.”) but not all the years or quarters were complete.
- Compare your data ONLY at the level that is common to the range you want to include. If you want to compare years, then only include the complete ones, i.e. 2011 – 2016 inclusive.
- If something looks odd when you visualise it, it could be odd and it could be a story. Or it could be missing data. A huge drop in prescriptions or a huge spike from one year to the next indicates that you should have a closer look at the dates. If they are all comparable, then go further and see what you can find (e.g. a patent expired or medication was taken off the market, etc.)
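A quick way to run this check before you start building is to list the months that are present and see which years have all twelve. The Python below is a minimal sketch under that assumption: the helper names `month_range` and `complete_years` are my own, and the list of months is generated from the start and end dates mentioned above (August 2010 to January 2017) rather than read from the actual dataset.

```python
def month_range(start, end):
    """Yield (year, month) pairs from start to end, both inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def complete_years(months):
    """Return the years for which all 12 months are present."""
    seen = {}
    for year, month in months:
        seen.setdefault(year, set()).add(month)
    return sorted(year for year, found in seen.items() if len(found) == 12)


# Months covered by the data, assuming no gaps between the first and last month
available = list(month_range((2010, 8), (2017, 1)))

full_years = complete_years(available)
print(full_years)
```

Only the years this returns are safe to compare at the year level. The same idea works for quarters if you count three months per quarter instead of twelve per year.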
So which vizzes stood out for me this week? Here are my 5 favourites…
Author: Louise Shorten
Link to come
What I like:
- Great design including different chart types and big numbers that demand my attention
- Very appealing colour choices that are consistent across the viz and work well together
- Text elements guide the audience and explain some of the key findings in the data
- The Title is at the centre rather than the top, which works because the map and black circle draw my eye to the middle. Adding the ‘information icon’ and calling out a key finding makes me curious to find out more and explore
- Chart titles are supported by horizontal lines which tell me what they relate to as they span the width of each chart area as well as divide up the dashboard visually
- Vertical arrows from each BAN (Big Ass Number) tell me where to look for supporting information and analysis
What I like:
- Very clean and simple design
- Interactivity allows me to filter the Top 20 drugs by 3 different measures, which updates the table and spark lines as well as changes the list of drugs included, so they are always genuinely the Top 20 of all prescriptions. The title also changes dynamically based on my selection, which shows great attention to detail.
- The spark lines add a visual element to the list but also show me something interesting, such as the seasonality of flu medication and that something significant happened with Atorvastatin in 2012
- I really appreciate the definitions of the measures being provided at the bottom of the viz. And this is easy enough to do because the definitions were provided in the dataset as well as on the source website
- Nice tooltips – simple but well formatted
Author: Adam Crahen
Link to come
What I like:
- It’s a single chart but it tells a story and is designed very well
- The colour choices mean I know exactly what the focus of the viz is, with Apixaban ‘popping out’ while the remaining drugs are there for reference in the background
- The title tells me about an event and makes me wonder ‘what happened next?’
- Using the percent difference compared to April 2012 when the drug was approved enhances the story and puts the December 2016 figures into perspective, while the arrows point out what I should be looking at
- I appreciate the text box telling me what Apixaban is used for
- I like the image of the chemical structure which ties back nicely to the topic of drugs and chemicals and I imagine that a scientist coming across this viz would look at it twice because of the imagery included, so it gets people’s attention
Author: Pooja Gandhi
Link to come
What I like:
- A really well designed infographic style viz that tells me a story of prescription medication in England starting at the summary level and going into detail as I move down the page
- The bar chart / timeline across the top is not only a representation of the data but also works as a design element and visual divider
- The downward arrows remind me that the story flows from top to bottom
- As the viewer I get a high level perspective of the data with annual average costs but I am also informed about the categories with the most prescriptions and see how those prescriptions changed over the years with regard to their costs
- I like the bottom section which focuses on two specific medications which are related (in my lay person understanding anyway) and also relate to the last of the 5 categories, ‘Nutrition and blood’, so they tie nicely back to the previous section
- I also really like the colours. They are neutral but professional and the change between white and ‘pigeon blue’ looks really good
What I like:
- The news article style of this viz looks really appealing with the title reminding me of a newspaper headline. The background provided sets the scene for the data and not just informs me but draws me in because it provides some punchy statistics
- The colour scheme is nice and simple and while I usually move away from the default orange and blue when I create vizzes, I think it works really well for this tree map
- Speaking of tree maps: how did a tree map make my top 5? After re-familiarising myself with the regions in England, I realised how cleverly designed this tree map is in that the different panes correspond to their geographical locations (check Google Maps if you don’t believe me)
- I like being able to display the labels for regions or towns dynamically to either have that context or focus purely on the heat map that is created by the tree map
- It looks simple but it’s very cleverly done and I love that kind of deception 🙂