In this assignment, you’ll pick one data set to make visualizations from. Then, you will make 4 visualizations – each telling a different “story” about the data. The idea here is that you should explore the different kinds of visualizations you might make from this data, and the different questions/tasks that you might want to show someone.
Note: this “challenge” is broken into 4 smaller phases. The last phase is bigger than the first ones, so you might want to start on it earlier.
The purpose of this assignment is to give you practice making visualizations that really show something (i.e., “tell a story”). We want you to see how, with the same data set, you can tell different stories by making different choices with the visualization. Therefore, you must pick one data set, and make visualizations from it that tell four distinct stories. That is, you will turn in 4(*) different visualizations (from the same data set), each telling a different story. (*) You will actually turn in more than 4 visualizations, but that’s another story.
We will provide you with the data set. We will give you a few choices - see Design Challenge 1 Data Sets.
We would like to emphasize the use of “end-user” tools (such as Tableau). In fact, we will require you to at least try Tableau and use it for the early phases of the assignment. We discourage you from programming to make visualizations for this assignment (if you need to write scripts to process or augment data, that is more acceptable). You may use any tools that you like to create the visualizations (for phase 4) – subject to the constraint that you are required to hand in PDFs, and to document your process.
You will hand in real visualizations with the real data. If you find that you aren’t able to exactly implement your design (e.g., you can’t figure out how to convince Excel to use the colors that you want), feel free to “cheat” a little (e.g., save the picture and open it in Photoshop and paint over it), but part of the idea is to try to make pictures with real data (so don’t just sketch – unless you are doing precise measurements). If you’re really stumped on implementation, you can put a note in your caption “the red dots were supposed to be blue” – but try not to leave too much to the imagination of the viewer.
We will grade your visualizations based on criteria detailed below. Generally, we care that you identify interesting things to show in the data (stories) and choose visualizations that effectively show those things. We care that you choose appropriate encodings, give good rationales for your choices, and that you take active steps to emphasize the message you are trying to make.
Part of this assignment will be peer critique. You will turn things in without your name on them (so we can do critique anonymously). We will then give everyone a few designs from other class members to critique. You will be graded on your critiques (these will contribute to your grade, but will not be used directly in grading the author).
Using Tableau is part of this assignment, although you don’t need to use it for your final turnin. There is pedagogical value in seeing Tableau as it is a nice embodiment of some of the “theory” we discuss in class. See the class Tableau page.
Note that for this assignment you are to create static visualizations, with appropriate captions, as PDFs. Interaction is for later in the class.
You will work individually for this design challenge.
Deadlines / Milestones
Each phase is a “Design Exercise” - and is somewhat independent. The first 3 phases are graded on the Grading (Ungraded Grading Scale) and you will receive a letter grade for the 4th phase (the main handin).
- Phase 1: Design Exercise 1: Try Tableau! (due Mon, Sep 27) - You must be able to use Tableau
- Phase 2: Design Exercise 2: Use Tableau (due Mon, Oct 4) - You will get familiar with Tableau
- Phase 3: Design Exercise 3: Explore Challenge Data (due Mon, Oct 11) - You will use Tableau (and maybe other tools) to explore the data you will use for the Design Challenge.
- Phase 4: DC1 Handin: Design Exercise 4 (due Mon, Oct 18) - This is really the “design challenge” where you turn in your final hand ins.
- Phase 5: (to be announced) There will be a peer-review component, which will be a separate design exercise.
Stories and Visualizations
The goal of this assignment is to create visualizations that “tell stories” from the data: that highlight / show something in the data. They shouldn’t just dump a bunch of numbers: they should make an intended message come out.
Try not to pick questions that can be answered with a single statistic – pick something where the visualization adds value. The richer and more complex the task or story (or set of stories) that the visualization tells, the more interesting (and challenging) it is, and the more opportunities you have to make a particularly cool “story”.
We’ve picked the data sets (but you get to choose amongst them). See Design Challenge 1 Data Sets for your choices. You get to pick the stories to tell. Think about stories that someone would care about. Stories that would be interesting.
A visualization should have a purpose - something that it helps the viewer see. This is the “story” that it tells. Visualizations that simply present the data (without something in mind) are not the goal here. “This visualization lets us see the distribution of grades in the class” is not as interesting (for this assignment) as “this visualization shows the correlation between how multi-variate a visualization is and the grade, letting us see that multi-variate visualizations are pretty much necessary for a good grade”.
Note: part of the assignment is to explore the data set to find the interesting stories to tell visually. You will probably want to use visualizations to do that! (more on that in a bit)
Part of the idea of this assignment is that by looking at the data in different ways, you can see different things in it. Therefore, we ask you to make 4 different visualizations, each telling a “different story” from the data (e.g., highlighting a different interesting thing in the data).
The data sets you may use are described on Design Challenge 1 Data Sets. You must use one of the data sets that we provide. You may process the data or “augment it” (get some other data that helps interpret the data - for example you may find a table that helps you translate from three letter airport codes to city name or GPS coordinates). But, in general, you should stick to the data provided. We will ask you
We care that you have a diverse set of stories: your visualizations show different aspects of the data.
For each visualization / story, we will check for:
- Is the question/story interesting and clear?
- Is it multi-variate?
- Is the design effective? (is it well adapted to the story/task?)
- Do the details represent good choices?
- Is the design appropriate for the data?
- Is the rationale properly stated (in the documentation)?
- Is the design complete (it has enough of a caption that it stands alone)?
The best designs for this assignment are multivariate and specifically adapted to the task/story. They may use a standard design (stacked bar chart), but use good choices in the details (e.g., the ordering of the bars or the colors) to make the “answer to the question” easy to see. We look for signs of students making explicit good choices to make what they want the viewer to see easy to see. (you can explain your choices in the documentation)
We will look for diversity in the stories that you choose to tell with the visualizations.
Generally, we look for diversity in designs. If all of your visualizations are bar charts, that’s often a bad sign. Of course, if it really is the case that you have found four questions that are each best answered with various bar charts, that’s less of an issue. But in general, you may want to pick stories that are told with a variety of designs.
To emphasize: what we are looking for is the explicit choices you made to emphasize your “story.” A “data dump” (just making a chart of some of the variables) is not likely to get you a good score. If you make explicit decisions - selecting subsets of data, highlighting particular points, arranging designs that emphasize certain aspects, etc. - this will be rewarded.
How to Make Visualizations
In Phases 1 and 2 of this challenge we require you to use Tableau. We want you to at least experience what it is like to work with a state-of-the-art commercial tool. Tableau also embodies many important visualization concepts (data abstractions, explicit choices about encodings and transformations, automatic selection of chart types, …), so it is useful to see how it works. Underneath, Tableau is built on lots of the best research and implementation practice. And its automatic guidance tools embody a lot of design knowledge from expert practitioners.
We will provide you with access to Tableau (both online and desktop), and give you some guidance on getting started with it and suggest resources. We won’t “teach” you to use Tableau. See the Tableau page.
For Phase 3 and Phase 4, you may use whatever tools you like. We encourage you to use Tableau, but we will not force you to do so. Any tool is fine, as long as you can get your visualizations into PDF files. It is fine to use Excel or Tableau or JMP or some other “tool.” We discourage you from programming to make your visualizations.
If you are going to make standard designs, you probably should use standard tools.
You may want to use some tools to “explore” the data initially, and then different tools to make the specific visualizations that you want. (although, Tableau is excellent for exploration)
Note: you are turning in static visualizations as PDFs. You may use interactive tools to make them, but you need the visualizations to “tell the story” as a static picture.
You may “edit” the visualizations that come from the tools (but please document this). For example, you might make a picture in Tableau, and then load that into Photoshop or PowerPoint in order to add captions or adjust colors.
You may find you want to do some data wrangling, data cleaning, or analysis before visualization. Try not to make this the main part of the assignment. Many of the data sets are “clean” enough that you can use them directly in Tableau. If you need to do some programming for this part, it’s OK – be sure to describe what you did, and turn in the programs that you wrote.
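As an illustration of how small such a processing script can be, here is a minimal Python sketch that augments rows with city names looked up from three-letter airport codes. Everything specific in it is an assumption made for the example: the column names (`origin`, `dest`, `delay_minutes`, `code`, `city`), the inline sample rows (made up, not taken from any of the provided data sets), and the helper names. In practice you would read your chosen data set and your lookup table from files and write the result to a CSV that you can open in Tableau.

```python
import csv
import io

# Hypothetical sample rows standing in for a provided data set (made up).
FLIGHTS_CSV = """origin,dest,delay_minutes
MSN,ORD,12
ORD,MSN,5
MSN,XXX,30
"""

# Hypothetical lookup table obtained elsewhere: airport code -> city name.
AIRPORTS_CSV = """code,city
MSN,Madison
ORD,Chicago
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def add_city_names(flights, airports):
    """Return copies of the flight rows with origin/destination city columns added.

    Codes missing from the lookup table get an empty string, so they are
    easy to spot (and mention in your documentation) later.
    """
    city_by_code = {row["code"]: row["city"] for row in airports}
    augmented = []
    for row in flights:
        new_row = dict(row)  # copy, so the original rows stay untouched
        new_row["origin_city"] = city_by_code.get(row["origin"], "")
        new_row["dest_city"] = city_by_code.get(row["dest"], "")
        augmented.append(new_row)
    return augmented


def to_csv(rows):
    """Serialize the rows back to CSV text, keeping the column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


augmented = add_city_names(load_rows(FLIGHTS_CSV), load_rows(AIRPORTS_CSV))
print(to_csv(augmented))
```

A script like this only prepares the data; the visualization itself can still be made in Tableau or another tool. If you write something similar, describe it in your documentation and turn it in.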
Phase 1 and Phase 2
Phases 1 and 2 were documented on separate web pages. See Design Exercise 1: Try Tableau and Design Exercise 2: Make some visualizations. These are basically “practice” for the main part of the assignment. For these phases, you were required to use different data sets than you will use for the main part of the challenge (3 and 4).
For the main part of the Design Challenge, you will pick one data set from a list of options (see Design Challenge 1 Data Sets). You will need to explore this data set to find the interesting “stories” in it, and then make visualizations to tell those stories.
For phase 3, we are checking on your process for doing this exploration. We simply ask that you show us progress in looking for stories.
The process of looking for interesting things in data is often called “Exploratory Data Analysis” (EDA). We don’t actually specifically talk about EDA much in class - but the foundations are the same. With EDA, you often make visualizations that allow you to quickly identify things that might be of interest, and then explore those things further. The emphasis is on quickly making visualizations (or statistical analyses) that expose things of interest.
For phase 3, we ask that you turn in two of the visualizations you made to explore the data. We don’t expect these to be as “good” as your final visualizations: these are visualizations that you make for yourself to decide what visualizations to make later. For example, you might make a chart showing all states (which is a pretty overwhelming bar chart with 50 columns) to explore what differences there are, and then make a more focused chart later that shows a specific interesting finding. The “data dumps” that are discouraged for Phase 4 are acceptable here: at this point, seeing that there is nothing interesting in a particular way of looking at the data is OK.
Phase 4 is the “main” part of the challenge - in a sense, everything else is practice to build up to this.
You will turn in 5 visualizations, and supporting documentation. We ask for 5 visualizations - exactly 5. You may not turn in more or fewer.
We ask you to identify 4 stories - and make a visualization for each. For one of the stories, we ask that you make a second visualization that tells the same story (an “alternate”). This is why there are 5 required handins, not 4.
We ask that you turn in each of the 5 visualizations as a PDF file, without your name on it. This will allow us to send them to classmates for anonymous peer review.
Note: an important part of the visualization is your rationale for why you made it. What choices did you make and why? Often, this will be obvious from the visualization. But you don’t have to leave it to chance that we’ll see it - you can tell us what you think you’ve made easy to see, and why you think that is easy to see (what about your design makes it so).
This “project” is presented on Canvas as a series of assignments, 1 per phase. As usual, we are trying to make Canvas do something that it isn’t.
We are trying to use Canvas Quizzes to have you hand in each phase. This way, you can turn in the (multiple) files you need to for each part, and we can structure your answers (to remind you what to do, and so we can read your answers more easily).
The downside of this is that Canvas quizzes are not designed for big complicated handins. They are designed for quizzes that you take in one sitting. So, you need to fill them out in one sitting. We strongly advise you to write your answers offline (in a text editor) and copy/paste them into the answer fields. You can look at the quiz to see the questions.
We have no idea how Canvas will deal with grading. Hopefully, it won’t fight us too much.
Captions and added text
Captions (and other text) are an important part of visualizations, and tricky for this assignment.
The distinction between a caption (roughly, a short text accompanying an image) and “in-line text” (text put into an image - such as a label, or a callout arrow) isn’t clear cut. So we encourage you to do both as needed.
There is also a tension between “your visualization should be so clear that the viewer has no problem seeing the right thing” and “a caption can help call attention to the right place and/or guide the viewer’s interpretation.” Don’t use captions as a crutch. If you write something like “Notice the faint blue dot hidden behind the red square, which means that”, you probably want to make this easier to see through visual design choices.
We ask that your visualizations “stand alone”. You should give a sense of what the data is, but do not need to document the data source. But there should be a clear title (that suggests the story), a caption, and any labels or in-line text that helps make the story clear.
After Phase 4, we will require everyone in class to write critiques of other student designs. Details of this phase will come later. But note: your “grade” for this phase is about the quality of your critiques, not of the designs you are asked to critique.
In the past, we’ve selected one design from each student and assigned each student 3 at random.
Your “story” should be “something that is easy to see in the visualization”, and the assignment is about making explicit choices to make the story be the thing that is easy to see.
Think about (and write in your rationale): What does this design make easy to see? This should be a fact about the data (the “story” – or the finding of the story). The fact should be specific – not just “we can see variable X and Y” – but something like “we can see the positive correlation” or “we can see the pattern that there are more Yelp check-ins on weekends.” And it should be something that really is easy to see (look at your visualization!). Why is it easy to see? What choices did you make that make it easy to see?