This week’s Makeover Monday topic came about when Tom Brown introduced me to Aisling Roberts, author of the article Gender Pay Gap Reporting 2018 – What Difference Could it Make? Aisling pointed me to a dataset on gov.uk that contains all companies in the UK that were required to report their pay information.
I warned everyone that this data set was tough, particularly because it contained no information about the number of employees, so you could not create weighted calculations to get completely accurate numbers at an aggregated level. On the data.world post, I did my best to explain the calculations with an example, and I also went through them on Tuesday’s Viz Review. Hopefully that helped.
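To see why the missing headcounts matter, here is a minimal sketch comparing a simple average with a headcount-weighted one. The company names, pay gaps, and employee counts are all invented for illustration (the real dataset has no employee counts), and weighting by total headcount is itself a simplification of what a fully accurate calculation would need.

```python
# Hypothetical companies: (name, mean pay gap in %, number of employees).
# Every value here is invented; the real dataset has no headcount column.
companies = [
    ("Small Co", 30.0, 250),
    ("Mid Co", 10.0, 1_000),
    ("Large Co", 2.0, 20_000),
]


def simple_average(rows):
    """Average the gaps, treating every company as equally important."""
    return sum(gap for _, gap, _ in rows) / len(rows)


def weighted_average(rows):
    """Average the gaps, weighting each company by its headcount."""
    total_employees = sum(count for _, _, count in rows)
    return sum(gap * count for _, gap, count in rows) / total_employees


unweighted = simple_average(companies)
weighted = weighted_average(companies)

print(f"Simple average gap:   {unweighted:.1f}%")
print(f"Weighted average gap: {weighted:.1f}%")
```

In this made-up example the small company with a large gap pulls the simple average well above the weighted one, which is the kind of distortion to keep in mind when aggregating this data.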
LESSON 1: IDENTIFYING PROBLEMS IN THE DATA
While exploring the bonus gap data, Graph Hopper (no idea what his/her real name is) created this simple scatterplot comparing bonus rates for men and women across all companies.
They noticed a really interesting stripe running across the dataset in the opposite direction to the rest of the data in the graph. Investigating a bit further, they found that for all of those companies, the reported bonus figures for men and women add up to 100%. Given how the data was supposed to be reported, this should be very unlikely. Companies are supposed to report the percentage of men and the percentage of women that received a bonus, not how the bonuses were split between men and women.
Hopefully the government will spot this as well, or better yet, take Graph Hopper’s work and make those companies correct their reporting. Bad data causes the overall numbers to be incorrect, which could lead to incorrect analysis and assumptions. Graph Hopper’s simple yet effective analysis demonstrates the exploratory skills and skeptical nature with which any data analyst should approach a data set. Never take data at face value; take time to verify its accuracy and determine what to do with the discrepancies you find.
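The kind of check behind this finding can be sketched in a few lines. This is not Graph Hopper’s actual method, only an illustration: the employer names, the field names male_bonus_pct and female_bonus_pct, the sample values, and the tolerance are all assumptions.

```python
import math

# Hypothetical rows; field names and values are invented for illustration.
rows = [
    {"employer": "A Ltd", "male_bonus_pct": 85.0, "female_bonus_pct": 80.0},
    {"employer": "B Ltd", "male_bonus_pct": 60.0, "female_bonus_pct": 40.0},
    {"employer": "C Ltd", "male_bonus_pct": 12.5, "female_bonus_pct": 10.0},
    {"employer": "D Ltd", "male_bonus_pct": 72.3, "female_bonus_pct": 27.7},
    {"employer": "E Ltd", "male_bonus_pct": 0.0, "female_bonus_pct": 0.0},
]


def looks_like_split(row, tolerance=0.1):
    """Return True when the two bonus percentages sum to roughly 100.

    A sum of 100 suggests the company may have reported how bonuses were
    split between men and women rather than the share of each gender
    that received one. The tolerance is an arbitrary allowance for rounding.
    """
    total = row["male_bonus_pct"] + row["female_bonus_pct"]
    return math.isclose(total, 100.0, abs_tol=tolerance)


# Collect the employers whose figures fall on the suspicious "stripe".
suspects = [row["employer"] for row in rows if looks_like_split(row)]
print(suspects)
```

A company could land on exactly 100% by coincidence, so flagged rows are candidates for a closer look rather than confirmed errors.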
LESSON 2: USING COLORS THAT AREN’T GENDER BIASED
In the past when we’ve provided data with a gender breakdown, we’ve seen lots of charts using pink for females and blue for males. While this may be intuitive for many, it’s also a gender bias that can easily be avoided. This week, very few visualizations used the pink/blue combo. Maybe the lessons are sinking in? Here are a few examples of alternative color palettes.
The idea is to create a clear distinction between the two genders. Each of the examples above shows that removing the gender bias moves the focus more towards the data and the analysis.