If you haven’t seen my “how to visualize” post, you might want to start there. Also, I usually do this example in lecture, so you may skip it if you’ve seen it before.
I like to think of visualizations as being made up of four components:
- Data Transformations
- Layouts
- Encodings
- Interactions
Here, I give a simple example of how those get used to solve visualization problems. The key points are to give examples of those components, and to introduce the idea of thinking of the components as “redesign” choices that can be made to improve a design so that it better addresses a task. A side effect is that it shows that you can do useful visualization work with simple tools (this is all in Excel).
The Task: I need to look at the grade distribution for my class and get a sense of whether it’s reasonable / fair. (More tasks may emerge as I look at the data.)
The Data/Resources: I have a table of student names (not shown), scores and grades. It’s in an Excel spreadsheet – so I’d prefer to keep things there. I don’t have many development resources (this is often done at the grading deadline).
The Design: I have chosen to use a table, since it’s easy in Excel. And it may be good enough (especially with design tweaks).
I am going to describe this as a “game” in the sense of something where we have a “current state” and at each “turn” I get to choose a “move.” The moves are to pick one of the 4 design components and change it.
I’ll start with an initial design – the first thing that comes to mind. Just look at the table in Excel the way the data comes to me.
This table has 57 rows (there were 57 students in the class). The last column is the grade I am going to give them, which is created by rounding the “Average” column (which is why the grade column is called “Rounded”).
For my task, this table isn’t great – I want to know things like “Am I giving enough As? Are too many people failing? What’s the median?” So, I need to improve my design.
Move 1: Layout – For my first design move, I’m going to choose to change the layout (the positions of the elements). Right now I’m using the position on the Y axis to tell me student number. This is useful if I need to find a specific student, but less good for seeing the grade distribution. So, I will change the layout: I’ll sort by the average.
An aside: this could also be seen as “interaction” – I want my vis to do two things, so I make it easy for the user to click a button (the column sort buttons in Excel) to switch between the two different things we may want to do. A joy of Excel is that, when you’re good at it, it makes these kinds of interactions easy.
After applying move 1, notice that the resulting vis is much better – I can quickly get a sense of how many As there are, how many failures, what the median is, …
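If you prefer code to spreadsheets, the same layout move can be sketched in a few lines of Python. This is only an illustration, not what I actually did: the rows are made up, and the 4-point scale with half-step grades is an assumption for the example.

```python
from collections import Counter
from statistics import median

# Hypothetical rows: these scores are made up for illustration, not the class data.
# Assumption: grades are on a 4-point scale, rounded to half steps.
rows = [
    {"student": 1, "average": 3.1, "rounded": 3.0},
    {"student": 2, "average": 3.8, "rounded": 4.0},
    {"student": 3, "average": 2.4, "rounded": 2.5},
    {"student": 4, "average": 3.3, "rounded": 3.5},
    {"student": 5, "average": 1.9, "rounded": 2.0},
]

# Layout A: order by student number (good for finding one student).
by_student = sorted(rows, key=lambda r: r["student"])

# Layout B: order by average, best first (good for seeing the distribution).
by_average = sorted(rows, key=lambda r: r["average"], reverse=True)

# With the sorted layout, the task questions are easy to answer.
grade_counts = Counter(r["rounded"] for r in by_average)
median_average = median(r["average"] for r in rows)

for r in by_average:
    print(r["student"], r["average"], r["rounded"])
print("count per grade:", dict(grade_counts))
print("median average:", median_average)
```

Keeping both orderings around mirrors the sort buttons: the data never changes, only which question the layout makes easy.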
Move 2: Encode – In the big table it’s hard to see the individual grades (if I shrink it to fit on my laptop screen). So, I’ll apply a color encoding – using Excel’s color range feature – on the rounded grades.
Now, I think I’m starting to win the game. I can really quickly see the proportion of each grade this distribution is giving – even in the thumbnail! It’s pretty clear that there aren’t many As, but there aren’t many failures either.
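For the curious, a color encoding like this is just a mapping from a value to a color. Here is a minimal sketch of a two-color linear scale in Python. The function names and the red and green endpoint colors are my own choices for illustration; I am not claiming this is how Excel computes its colors.

```python
def lerp_color(low_rgb, high_rgb, t):
    """Linearly blend two RGB colors; t=0 gives low_rgb, t=1 gives high_rgb."""
    t = min(1.0, max(0.0, t))
    return tuple(round(a + (b - a) * t) for a, b in zip(low_rgb, high_rgb))


def color_scale(values, low_rgb=(248, 105, 107), high_rgb=(99, 190, 123)):
    """Map each value to a hex color between low_rgb (minimum) and high_rgb (maximum)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    return [
        "#%02x%02x%02x" % lerp_color(low_rgb, high_rgb, (v - lo) / span)
        for v in values
    ]


# Hypothetical rounded grades (made up for illustration).
rounded_grades = [2.0, 3.0, 3.5, 4.0]
for grade, color in zip(rounded_grades, color_scale(rounded_grades)):
    print(grade, color)
```

The lowest grade gets the reddish endpoint and the highest gets the greenish one, with everything else blended in between.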
But, it does raise a new task: are there students who are being hurt by the rounding? This is somewhat easy to check, since I can look at people near the borders. But I can apply a visualization design move…
Move 3: Transform Data – I can make a new “derived” column that is the difference between the rounded grade and the average. This directly measures what rounding does to people. And to skip a step, I can color code it: I’ll use a red/blue diverging scale (you’ll learn about these later, but they are built into Excel). That way, red means someone hurt by rounding, blue is someone helped.
Now I think I’ve won this round of the visualization game. To do my task of getting a sense of who is most hurt by rounding, I can look for dark red – if I want to make it even easier, I can sort by that column! (use interaction)
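Here is a rough Python sketch of this move: derive the column, color it with a diverging scale, then sort by it. Several things are assumptions for the example and not taken from my spreadsheet: the averages are made up, grades are rounded to the nearest 0.5, the derived value is rounded minus average (so negative means hurt), and the helper names are hypothetical.

```python
import math


def round_to_step(value, step=0.5):
    """Round to the nearest multiple of step (halves round up). The step is an assumption."""
    return math.floor(value / step + 0.5) * step


def diverging_hex(delta, max_abs):
    """Red for negative deltas, blue for positive, white at zero."""
    t = 0.0 if max_abs == 0 else min(1.0, abs(delta) / max_abs)
    fade = round(255 * (1 - t))
    rgb = (255, fade, fade) if delta < 0 else (fade, fade, 255)
    return "#%02x%02x%02x" % rgb


# Hypothetical averages (made up for illustration).
averages = [3.24, 3.26, 2.70, 3.00]

rows = []
for avg in averages:
    rounded = round_to_step(avg)
    # Derived column: what rounding did to this student.
    rows.append({"average": avg, "rounded": rounded, "delta": rounded - avg})

# Scale colors symmetrically around zero so red and blue are comparable.
max_abs = max(abs(r["delta"]) for r in rows)
for r in rows:
    r["color"] = diverging_hex(r["delta"], max_abs)

# The "sort by that column" interaction: most hurt first.
most_hurt_first = sorted(rows, key=lambda r: r["delta"])
for r in most_hurt_first:
    print(r["average"], r["rounded"], round(r["delta"], 2), r["color"])
```

Scaling by the largest absolute difference keeps white pinned at zero, which is the point of a diverging scale: the sign picks the hue and the size picks the darkness.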
And note: the reason I consider this “winning” is that I am able to do the tasks I need to do (get a sense of the grade distribution, see who is hurt by rounding). The visualization “game” is ultimately about tasks.
Lessons
Hopefully, you now have a sense of what the 4 design moves are. And you’ve seen how we can change our choices to make a visualization more effective for a task. Along the way, you may have gotten a sense of how understanding the task helps. And how effective designs can be simple – and done with existing tools.
Note that the details of the implementation didn’t matter. I could have done this with a different spreadsheet program, or if I had more time, I could have written a JavaScript program. In fact the details are really irrelevant: there’s no use telling you which buttons in Excel to push, since this was done with an old version of Excel (the features are still there – you just get at them differently).