Given the hot and beautifully sunny days we’ve been enjoying in Bavaria (and the rest of Germany) lately, it seemed only fitting to look at some air quality data.
Andy picked a dataset containing hourly ozone measurements for 766 US counties going all the way back to January 1990. Sounds like a lot of data? Well, it's 202 million records, which means we get to play in Exasol again because that's how we roll!
Let’s start with this week’s makeover candidate. Everyone looking for it needs to actually generate their own chart, based on a date range and county or city selection.
I picked Clark County in Nevada, because I’m interested in analysing data from that area for my viz, so here is the original chart:
What I like about it:
- I get to create it myself by selecting date range and county, so I can tailor the view to what I’m interested in – within the limitations of what is offered
- The columns and rows are easy to understand: most people are familiar with the timeline layout for months going across horizontally and years going down
- The title reminds me of what I’m looking at
- The position of the colour legend is nice and central so I can refer back to it easily
What I don’t like about it:
- There is no additional context provided, but I’d like to get a bit more guidance on what the data is telling me – is there a trend over time?
- The colours don’t work so well for me. I see a lot of green and yellow, but the unhealthy AQI results are difficult to spot because they usually cover a short time period, which is hard to make out among the good and moderate AQI values
- The colour legend would work better with multiple columns and thicker lines for each colour so they’re easier to spot
- I like how leap years are identifiable by the slightly longer bars, but having them extend on the right side makes it look like December has an extra day, rather than February
- I’d prefer a bit more space vertically; it looks a bit squished
- I’d also like a reference line that tells me whether this county is within the same range as surrounding counties and how it compares to the national average
What I did:
- I decided early on to pick a single county. I don’t have any preference as I don’t really know where most of my US-based Twitter friends live :-). So I chose Clark County, where Las Vegas is located, because that’s where I’ll be heading for #data17
- Narrowing my analysis down to one county also allowed me to reduce complexity and find a focus quickly. I’m a bit strapped for time this week.
- I wanted to use the hourly measurements and I wanted them on a single chart
- I also wanted to show seasonality and whether ozone measurements are generally at risky levels or not
- My colours aim to reflect that good and moderate levels are the ones we shouldn’t be too worried about, so with the greys I wanted to remove the focus on those data points, even though they make up the majority of marks in the viz
- Lastly, I decided to put a bit of effort into conditionally formatting my tooltips, because I find that different categories in the data, especially when used on colours, should be reflected in the tooltips
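For anyone who wants to try the colour and tooltip idea outside of a viz tool, here is a rough sketch in Python. It is not how the viz itself was built. It assumes each hourly record already carries an integer AQI value and uses the standard US EPA category breakpoints. The palette, the function names `aqi_category` and `tooltip`, and the sample values are all made up for illustration.

```python
from bisect import bisect_left

# Upper bounds of the standard US EPA AQI categories; anything above 300 is "Hazardous".
AQI_UPPER_BOUNDS = [50, 100, 150, 200, 300]
AQI_CATEGORIES = [
    "Good",
    "Moderate",
    "Unhealthy for Sensitive Groups",
    "Unhealthy",
    "Very Unhealthy",
    "Hazardous",
]

# Hypothetical palette: greys for the two low-risk categories so they fade
# into the background, stronger colours for everything above them.
CATEGORY_COLOURS = {
    "Good": "#d9d9d9",
    "Moderate": "#bdbdbd",
    "Unhealthy for Sensitive Groups": "#fdae61",
    "Unhealthy": "#f46d43",
    "Very Unhealthy": "#d73027",
    "Hazardous": "#7f0000",
}

LOW_RISK = {"Good", "Moderate"}


def aqi_category(aqi: int) -> str:
    """Return the AQI category for a non-negative integer AQI value."""
    return AQI_CATEGORIES[bisect_left(AQI_UPPER_BOUNDS, aqi)]


def tooltip(timestamp: str, aqi: int) -> str:
    """Build tooltip text that adds an extra note only for the risky categories."""
    category = aqi_category(aqi)
    text = f"{timestamp}: AQI {aqi} ({category})"
    if category not in LOW_RISK:
        text += ". This hour was above the moderate range."
    return text


if __name__ == "__main__":
    # Made-up sample values, not taken from the dataset.
    for ts, aqi in [("2016-07-01 14:00", 42), ("2016-07-01 15:00", 112)]:
        print(CATEGORY_COLOURS[aqi_category(aqi)], tooltip(ts, aqi))
```

Keeping the category lookup in one function means the colour and the tooltip wording always agree, which is the point of formatting tooltips conditionally in the first place.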
Click on the image to interact