We did it in week 6 and we’re doing it again in week 16: We’re giving you a big dataset to work with so you can flex your analytical muscles, try something different, find an interesting story or literally just throw some really big figures on your dashboard.

We (and by we, I really mean my colleague Johannes, whose support in this venture I greatly appreciate and who is just as excited as I am to see what you guys will do with the data) have prepared a dataset giving you GP practice prescribing data for the UK from 2010 – 2017. The full dataset is available for you to access on EXASOL.
The data contains ‘a list of all medicines, dressings and appliances that are prescribed and dispensed each month’.

Similar to week 6, here are some tips ahead of the data being published…

The data

  1. We will provide the data as a .tds file with a live connection to EXASOL, which means:
    • unless you have previously done so, you need to register for access to the database on www.exasol.com/dataviz (if you participated in week 6, your ‘old’ login will still work)
    • you need to make sure you have the EXASOL drivers installed for Tableau. They can be found here (go to Download ODBC Driver and select the appropriate file)
    • you download and open the .tds in Tableau and log in with the credentials provided to you in the registration confirmation email
    • you benefit from a data connection, where we have joined the relevant tables for you, have formatted the fields and added field definitions where appropriate – easy peasy
  2. We will also provide an extract file as usual, but in a packaged workbook on Tableau Public for you to download if you just want to use a subset of the data. The extract is limited to fewer than 15 million rows to ensure you can publish your viz easily on Tableau Public. The data is limited to 2015 at a year level, which will still allow you to make over the vizzes from the original article. Use this if you use the Tableau Public App.
  3. And if you’re desperate to get started today, you can simply register and build the connection yourself from the schema ‘PRESCRIPTIONS_UK’ (connection details provided in the registration email). If you choose this option, please use the join conditions below:
    • Join the tables ‘PRESCRIPTIONS’ and ‘CHEMICAL_SUBSTANCES’ on the field SK_CHEM_SUB
    • Join the tables ‘PRESCRIPTIONS’ and ‘PRACTICE_ADDRESS’ on the field SK_PRACTICE_ADDRESS
  4. Data Definitions
    • are provided in the .tds, .tde and the .twbx
    • can also be found in the Glossary of terms and the FAQs by HSCIC
    • Please read them, because this data isn’t trivial as it deals with people’s health and provides insights into the state of a nation’s health.
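If you build the connection yourself, the two join conditions above can be sketched in a few lines of Python. This is only a toy illustration using an in-memory SQLite database, not EXASOL: the table names and the join keys SK_CHEM_SUB and SK_PRACTICE_ADDRESS come from the schema described above, while every other column (ITEMS, CHEM_SUB_NAME, TOWN), all of the rows and the choice of inner joins are assumptions made up for the example.

```python
import sqlite3

# Toy stand-in for the real schema. Only the table names and the two join
# keys come from the post; all other columns and all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRESCRIPTIONS (
    SK_CHEM_SUB INTEGER,
    SK_PRACTICE_ADDRESS INTEGER,
    ITEMS INTEGER              -- hypothetical measure
);
CREATE TABLE CHEMICAL_SUBSTANCES (
    SK_CHEM_SUB INTEGER PRIMARY KEY,
    CHEM_SUB_NAME TEXT         -- hypothetical column
);
CREATE TABLE PRACTICE_ADDRESS (
    SK_PRACTICE_ADDRESS INTEGER PRIMARY KEY,
    TOWN TEXT                  -- hypothetical column
);
INSERT INTO CHEMICAL_SUBSTANCES VALUES (1, 'Substance A'), (2, 'Substance B');
INSERT INTO PRACTICE_ADDRESS VALUES (10, 'Town X'), (20, 'Town Y');
INSERT INTO PRESCRIPTIONS VALUES (1, 10, 5), (2, 10, 3), (1, 20, 7);
""")

# PRESCRIPTIONS sits in the middle and is joined to each lookup table
# on its surrogate key (inner joins are an assumption).
JOIN_QUERY = """
SELECT c.CHEM_SUB_NAME, a.TOWN, SUM(p.ITEMS) AS TOTAL_ITEMS
FROM PRESCRIPTIONS p
JOIN CHEMICAL_SUBSTANCES c ON p.SK_CHEM_SUB = c.SK_CHEM_SUB
JOIN PRACTICE_ADDRESS a ON p.SK_PRACTICE_ADDRESS = a.SK_PRACTICE_ADDRESS
GROUP BY c.CHEM_SUB_NAME, a.TOWN
ORDER BY c.CHEM_SUB_NAME, a.TOWN
"""

rows = conn.execute(JOIN_QUERY).fetchall()
for row in rows:
    print(row)
```

In Tableau you would set up the same two joins in the data source pane rather than writing SQL; the sketch just shows which table connects to which, and on what field.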

Working with Big Data in Tableau

  • before you throw 100 million data points onto a single map, add some filters. The map is unlikely to render, and do you really expect Tableau to visualize that many data points in one chart? Do you even need it? 🙂
  • familiarise yourself with the dimensions and measures for a little bit before you start your analysis, in case you feel a bit overwhelmed or don’t know where to start (my suggestion: use the ‘describe’ functionality)
  • don’t be afraid to keep it simple. Just because the dataset is massive doesn’t mean your dashboard has to be. You can tell a story about the dataset or about the content, stick to a high-level overview or go into the detail
  • this dataset contains many potential topics. Feel free to go down a narrow road of something you find particularly interesting and see what you can find out. There are a number of findings published (e.g. here and here) which are based on this data, so dig around and create the next big story 🙂

Publishing your work

Please note that Tableau Public only allows you to publish a data extract with up to 15 million records. The .tde we provide works for that. However, you cannot publish a dataviz with a live connection to EXASOL. We provide the whole dataset purely for you to explore ALL the available data and to play with big data in Tableau.

Check out this great video Andy created to teach you how to go from the big dataset to creating an extract for publishing that will be small enough.

Once you finish your analysis and visualization of the live data, please tweet an image as you usually do as part of #MakeoverMonday (tag @TriMyData and @VizWizBI) to show us what you’ve created.

Tweet and blog about your #MakeoverMonday dataviz just like you usually do. Let us know how you went about it, especially if you used the full dataset. We love reading about people’s analysis processes and seeing how they go about their data analysis and dataviz creation.

We will showcase the best visualizations on EXASOL’s website.