Data Assignment 1: 1 Data Set, 4 Stories

Due Dates:
Rough drafts: Thursday, March 12, 11:00am
Designs handed in: Wednesday, March 18, 11:59pm — Turn-in link
Critiques: Wednesday, March 25, 11:59pm — more details and turn-in link

Objectives: To make some visualizations with real data, and to explore how to tell different “stories” by choosing different encodings of the data. This is a chance to try out using visualization tools. If you’re an 838 student, you’ll get to try your hand at dealing with a larger data set.

Overview

In this assignment, you’ll pick one data set from the list below and make 4 visualizations from it, each telling a different “story” about the data. The idea is to explore the different kinds of visualizations you might make from this data and the different questions/tasks you might want to address for someone, and to see how you can match the picture to the question.

For 638 students, you may pick one of the smaller data sets. We’ve gone through the work of “cleaning” them and getting them into an easy-to-work-with form. For 838 students, you are required to work with one of the larger data sets.

You may use whatever tools you like to do this assignment. Excel (or some spreadsheet) is totally fine (and is probably what I would use for the 638 version of the assignment). You are welcome to try to use Tableau (see Canvas for how to get a free student version) – in fact, we’ll have an in-class demo to help you get started. If you want to write a program to process the data, that’s fine too.

For the rough drafts, sketches are fine – the goal is to get feedback from others. We encourage you to do a lot of sketching to try out different designs (even if we can’t give feedback on all of them).

For the final ones, you should make real visualizations with the real data.

If you find that you aren’t able to implement your design exactly (e.g. you can’t figure out how to convince Excel to use the colors that you want), feel free to “cheat” a little (save the picture, open it in Photoshop, and paint over it), but part of the idea is to try to make pictures with real data (so don’t just sketch – unless you are doing precise measurements). If you’re really stumped on implementation, you can put a note in your caption (“the red dots were supposed to be blue”) – but try not to leave too much to the imagination of the viewer.

On Thursday, March 12th, we’ll take some class time to look at people’s rough drafts, and give each other feedback. You only need to bring 2 rough drafts, but you might want to have more ready.

On Wednesday, March 18th, you’ll turn in your 4 designs.

On Wednesday, March 25th, you’ll turn in critiques of other people’s work (more details).

Note that for this assignment, your visualizations must be static (in a PDF). Not interactive. Not animated.

What to turn in

Rough drafts (3/12): bring rough versions of your designs to class on paper. Print them out if they are on the computer, but for the rough drafts, sketches are OK. You will need to bring at least 2.

Designs (3/18): For your 4 visualizations, turn in a PDF file for each one of them. Each PDF should be 1 page and should have the visualization and a caption (a good figure should have a good caption, although if your graph is really great, the reader might figure out the story without reading the caption). It should be clear from the visualization and/or caption which data set it uses. Please do not put your name inside the PDF (so that we can send them out for anonymous critique). Turn in a 5th document that explains how you made the pictures and what you were trying to show with each one. These will be turned in as an assignment on Canvas.

Critiques (3/25): We will give you a few of your classmates’ designs to critique. The goal is not only to give them feedback, but also to show us that you understand the concepts that you’ve learned about in class and can apply them to critiquing others’ work. We have a post that details the rubric and the turn-in mechanisms here.

Grading

The overall project will be graded and worth more than 2 discussions. We will evaluate your designs (especially how well they show that you’ve thought about how to choose encodings to make a point). We will consider how well you choose the points you want to make (if you pick something that’s really simple to show, we’ll have higher expectations about what it means to show it well). Critiques will be a part of the final evaluation: both the quality of the critiques that you write and what others think of your designs.

Your goal is to create visualizations that effectively achieve their goals (and make those goals clear). Evaluation will focus on this. For example, do you choose encodings and layouts that are both well matched to the data and expose the thing you’re trying to show?

Data and Example Questions

Try not to pick questions that can be answered with a single statistic; pick something where the visualization adds value. The richer and more complex the task or story (or set of stories) that the visualization tells, the more interesting (and challenging) it is, and the more opportunities you have to make a particularly cool “story”.

For example with the airline data (a month of flight delay information):

You could give the statistics on the average delay for flights leaving Madison.
You could give the statistics on flight delays leaving Madison, helping someone choose which destination has the fewest delays, or what time of day you are most/least likely to have a delay, or some combination of both.
You could present information on a bunch of city pairs – for example, to help someone plan a trip between Madison and San Francisco: which hub city is it best to connect through? What time of day should you leave (if your goal is to avoid delays)?
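
If you do decide to write a program, the first two questions above come down to grouping flights and averaging their delays. Here is a minimal sketch in Python of what that might look like. The column names (ORIGIN, DEST, CRS_DEP_TIME, DEP_DELAY) and the sample rows are assumptions made up for illustration, so check the header of the file you actually download before relying on them.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Tiny made-up sample standing in for the real flight file.
# The column names are assumptions; the real file may differ.
SAMPLE = """ORIGIN,DEST,CRS_DEP_TIME,DEP_DELAY
MSN,ORD,0600,5
MSN,ORD,1730,45
MSN,MSP,0615,-3
MSN,MSP,1745,12
MSN,DTW,0900,
ORD,SFO,0800,20
"""


def mean_delay_by(rows, origin, key):
    """Average departure delay (minutes) for flights leaving `origin`,
    grouped by whatever `key(row)` returns."""
    groups = defaultdict(list)
    for row in rows:
        # Skip other airports and rows with no recorded delay.
        if row["ORIGIN"] != origin or row["DEP_DELAY"] == "":
            continue
        groups[key(row)].append(float(row["DEP_DELAY"]))
    return {k: mean(v) for k, v in groups.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Story 1: which destination has the least delay?
by_dest = mean_delay_by(rows, "MSN", lambda r: r["DEST"])

# Story 2: which scheduled departure hour has the least delay?
by_hour = mean_delay_by(rows, "MSN", lambda r: int(r["CRS_DEP_TIME"][:2]))

print(by_dest)
print(by_hour)
```

To run this on a real file, replace io.StringIO(SAMPLE) with an opened file. The resulting tables are only the starting point: the assignment is about how you choose to draw them.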

We’ve picked the data sets (but you get to choose amongst them). You get to pick the stories to tell. Think about stories that someone would care about. Stories that would be interesting.

Data Sets

We want you to use one of the data sets we provide in Box. Finding good data sets takes time, cleaning the data set once you’ve found it takes time, and even deciding if the data set is sufficiently good to visualize takes time. We’d rather you use your time to devise visualizations.

Small Data Sets:

Simple commerce data (fabricated; thanks to S. Franconeri) – link
Metropolitan Area population change (2010 to 2011), US Census – link
- Original source: Census.gov (you’re not going to want to remake that vis!)

Medium Data Sets: (these are much harder to work with than the small ones)

Government contracts (unspecified dates; thanks to S. Franconeri) – link
Federal government receipts (1962 to 2020, est.) – link
- Numbers in this data set are in thousands of dollars (e.g. 2 = $2k)
- It might help to read the user guide put out by the White House.
- Original source: The White House’s GitHub

Large Data Sets: (if you’re an 838 student, you must choose one of these)

On-time flight data sets (December 2009, June 2014, December 2014)
- Contains flight information, departure/arrival times, departure/arrival locations, sources of delay, times of delay, tail numbers, diverted locations, etc.
- If you are interested in doing some sort of mapping, here’s a resource for federal transportation points of interest: National Transportation Atlas Database
- Original source: USDOT’s Bureau of Transportation Statistics (code lookups available here)
Federal government budget outlays and authority (budget authority)
- Numbers in this data set are in thousands of dollars (e.g. 3,453 = $3,453,000)
- It might help to read the user guide put out by the White House.
- Original source: The White House’s GitHub
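
Since both federal budget data sets record amounts in thousands of dollars, it is easy to mislabel an axis by a factor of 1,000. If you process the data with a program, a small conversion helper can help. This is only a sketch: it assumes the values arrive as whole-number text, possibly with comma separators, which you should verify against the actual file. The function names are made up for illustration.

```python
def thousands_to_dollars(text):
    """Convert a value recorded in thousands of dollars (e.g. "3,453")
    to whole dollars. Assumes whole-number text with optional commas."""
    return int(text.replace(",", "")) * 1000


def format_dollars(dollars):
    """Format whole dollars with a dollar sign and thousands separators."""
    sign = "-" if dollars < 0 else ""
    return f"{sign}${abs(dollars):,}"


# The two examples given in the data set notes above.
print(format_dollars(thousands_to_dollars("2")))
print(format_dollars(thousands_to_dollars("3,453")))
```

Whichever tool you use, say in the caption or axis label which unit you are plotting.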

Critique Rubric

We will provide a rubric for critiques of designs. Here are some thoughts as to what will be on it.

How good/interesting are the “stories” that you chose? Did you pick a diverse set?
Are the things you chose to show multi-variate?
How well chosen are your encodings? Are they effective at communicating the message?
How well “implemented” are the designs? Are the specific detail choices made thoughtfully?

3/17: We’ve posted a clarifying post on how the critique will work and the rubric of the four stories.
