DC3: Design Challenge 3

by Mike Gleicher on September 13, 2018

Updates after announcement:

[November 16] – there is a no-cost extension to phase 1 through Thanksgiving.
[November 18] – please turn in any “documents” as either plain text or PDFs – please convert word files, markdown files, “webarchive files” to a PDF (we’ve had trouble reading some of these)
[November 18] – We cannot promise that we will be able to have demos for everyone (or anyone).
[November 25] – clarifications to final handins
[November 25] – A task list is posted here
[November 28] – A reminder to give attribution to code used
[November 28] – reminder to upload one file

“Spaghetti plot” is a pejorative term for a multi-line line graph that has too many lines for people to see everything in them. Despite their problems, they are ubiquitous. In this design challenge we will try to understand them, figure out what they are good and bad for, and (hopefully) come up with something better for the places where they do not work.

In a nutshell…

November 16 – DC3-1 Task Analysis and Critique Plans, Situations and Tasks – You will turn in a list of tasks and situations where this applies.
November 23 – Thanksgiving – no project component due.
November 30 – DC3-2: Designs and Plans – You will turn in designs for solutions, and describe what else you will do (make a tool, provide data sets, provide experimental designs, …).
December 7 – DC3-3: Drafts – You will turn in a draft of your assignment so that we can get a sense of what to expect.
December 14 – DC3-4: Final Handin – You will turn in your final product. This deadline is fairly firm since we need to get grading done before grades are due.

Contents

1 In a nutshell…
2 Background
3 The structure of the assignment (or why is this so complicated?)
4 The end result
5 Data Sets
6 Comparison with prior years…
7 The parts of the assignment
8 Requirements / What you’ll turn in / Assessment
- 8.1 Some thoughts on grading…
9 Ground Rules and Starting Points
10 Working with a Partner
11 Some Other Resources
12 Things that went wrong in 2017

Background

This assignment is about multi-line line graph data. A multi-line line graph is just one visualization of the particular kind of data that we’re interested in, but that name is easier to say than “data sets with a nominal/categorical set of items, each with a quantitative (interval/ratio) dimension, where at each sample there is a quantitative (interval/ratio) value.” To make our lives a little easier…

There is a set (potentially a nominal/categorical set – but it might be ordered) of “objects” – these are the things that we have a line for. (we have N objects)
Each line covers a range (for purposes of discussion, let’s call it “time” – even though the dimension can be of any type). For this assignment, we can assume that each line has the same start and end “time”. Even though this is a continuous dimension, we’ll assume we have a uniformly spaced set of samples (so we can simply refer to them by integers), without any gaps. (we have M samples)
At any “time” (for any one of the samples on the line), the line has a “value” – which is a quantitative value within some range.
We’ll ignore the fact that the “time” dimension is time (since it isn’t always). This means that there aren’t “obvious” cycles that we need to account for (like seasons, or day/night).
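The assumptions above can be written down as a small data structure. This is only a sketch: the class name `SeriesSet` and its fields are hypothetical, and the city names and values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SeriesSet:
    """N named lines, each with M uniformly spaced samples (hypothetical helper)."""

    names: List[str]           # one label per object/line (length N)
    values: List[List[float]]  # values[i][j] = value of line i at sample j

    def __post_init__(self) -> None:
        # One name per line, and every line has the same number of samples.
        if len(self.names) != len(self.values):
            raise ValueError("need exactly one name per line")
        if len({len(row) for row in self.values}) > 1:
            raise ValueError("every line must have the same number of samples")

    @property
    def n(self) -> int:
        """Number of lines (N)."""
        return len(self.values)

    @property
    def m(self) -> int:
        """Number of samples per line (M)."""
        return len(self.values[0]) if self.values else 0


# Made-up example: 3 lines (N=3), 4 samples each (M=4).
data = SeriesSet(
    names=["Madison", "Lima", "Oslo"],
    values=[
        [-5.0, 2.0, 14.0, 21.0],
        [22.0, 23.0, 21.0, 19.0],
        [-3.0, 1.0, 9.0, 16.0],
    ],
)
print(data.n, data.m)
```

Samples are addressed by integer index, which matches the "uniformly spaced, no gaps" assumption.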

For example…

We may have climate data. For 50 cities around the world (N=50), over the course of 3 years, we have measured the temperature each day (M=365*3). For each of those 50*365*3 observations we have a temperature in degrees Celsius (in the range -20 to 50).
We may have sales data. For 100 products (N=100), over the course of 10 years, each month (M=10*12) we have the number of items sold.
We may have noise data on a train. For each of the 12 cars on a train (N=12), over the course of the 100km route (measured every km) (M=100), we have a measure of the noise level.
We may have data for trans-ocean cables. For each of N cables, over the length of each cable (M), we have a measurement of the cable’s depth.

Notice that there are 3 types of scale we need to contend with:

The number of “lines” (N)
The number of samples of each line (M)
The dynamic range of the values (if the values are over a very wide range they can be harder to show)
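To see where a data set sits on these three scales, a quick summary can help. The function below is a hypothetical helper, and the example values are made up; it simply reports N, M, and the smallest and largest value.

```python
def scale_summary(values):
    """Return (N, M, (lowest, highest)) for a non-empty list of equal-length lines."""
    if not values or not values[0]:
        raise ValueError("need at least one line with at least one sample")
    n = len(values)       # number of lines
    m = len(values[0])    # number of samples per line
    flat = [v for row in values for v in row]
    return n, m, (min(flat), max(flat))


# Made-up example with a very wide value range.
example = [
    [1.0, 2.0, 3.0],
    [10.0, 1000.0, 100000.0],
]
n, m, (lo, hi) = scale_summary(example)
print(n, m, lo, hi)
```

In this made-up example the second line spans five orders of magnitude, so on a shared linear axis the first line would look flat.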

If all three of those are small (N,M, range), then it’s easy – you can use a multi-line graph. But as N, M and/or the range grows, the problem gets trickier. And that’s where the assignment comes in. There are three obvious designs:

Spaghetti Plot: showing multiple lines in one graph – usually differentiating the lines with different colors. (it’s called a spaghetti plot because with many lines it becomes a tangled mess). It’s a position encoding for each of the quantitative dimensions, and color (or some property of the line) for the categorical one.
Small Multiples: showing individual line graphs – probably making each one small, which is why it’s called “small multiples”. With many lines, it becomes big (or has very tiny lines). This uses a position encoding for all 3 dimensions (we use position to encode which line).
Encoding the value with color, rather than height, so that each line is a strip of color. A stack of these strips of color has been called a “Lasagna Plot” (see resources) – I’m not sure if this term has caught on, but I like it and will use it. This uses position for the “time” dimension and the item, but uses color for the value.
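To make the lasagna encoding concrete, here is a text-only sketch that uses a ramp of characters as a stand-in for a color ramp. It is not the baseline implementation; the function name, the ramp, and the example values are all assumptions for illustration.

```python
# Ten characters from "low" to "high": a stand-in for a color ramp.
RAMP = " .:-=+*#%@"


def lasagna_rows(values):
    """Encode each line as one strip: position = time, character = value."""
    lo = min(min(row) for row in values)
    hi = max(max(row) for row in values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    rows = []
    for row in values:
        cells = []
        for v in row:
            t = (v - lo) / span                            # normalize to [0, 1]
            idx = min(int(t * len(RAMP)), len(RAMP) - 1)   # pick a ramp bin
            cells.append(RAMP[idx])
        rows.append("".join(cells))
    return rows


# Made-up data: 2 lines, 3 samples each.
for strip in lasagna_rows([[0, 5, 10], [10, 0, 5]]):
    print("|" + strip + "|")
```

Each output row is one line of the data set, so the vertical position identifies the item, the horizontal position identifies the sample, and the character (color, in a real plot) carries the value.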

By “visual design” I am referring to encoding. There may be other encodings (can you think of some? That’s part of the challenge here). Here are those three designs, generated by the simplest program that I could write, all using the same “fake” data. You can check out my simple implementation (described below). You can also try out Florian’s D3 example.

For each encoding, there are lots of minor variants. You can add interaction to highlight one element in a spaghetti plot; you can scroll and filter to select within small multiples; you can change the colorings of a lasagna plot, … Given how common this kind of data is, you’d expect there to be good solutions. Or at least a well-characterized space of design decisions that gives guidance on how to make informed choices. But I am not aware of any in the literature. So, we have to work it out ourselves in this assignment.

There is a fourth complexity (beyond number of lines, number of samples/line, range of values): if the samples aren’t uniform, there may be uneven gaps and different spacings. Imagine that for each different line, we get the measurements on different days… This is an additional type of complexity that you may optionally consider.
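If you do consider unevenly sampled lines, one possible approach (not something the assignment prescribes) is to resample each line onto a uniform grid by linear interpolation, so that the uniform-sample assumption holds again. The function name and example values below are hypothetical, and the sketch assumes the sample times are strictly increasing.

```python
def resample_uniform(times, values, m):
    """Linearly interpolate (times, values) onto m evenly spaced samples.

    Assumes `times` is strictly increasing and has at least two entries.
    """
    if m < 2 or len(times) < 2 or len(times) != len(values):
        raise ValueError("need m >= 2 and at least two matching (time, value) pairs")
    start, end = times[0], times[-1]
    step = (end - start) / (m - 1)
    out = []
    k = 0  # index of the left end of the current interpolation segment
    for j in range(m):
        t = start + j * step
        # Advance to the segment [times[k], times[k + 1]] that contains t.
        while k < len(times) - 2 and times[k + 1] < t:
            k += 1
        t0, t1 = times[k], times[k + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[k] + w * (values[k + 1] - values[k]))
    return out


# Made-up example: measurements at times 0, 1 and 4, resampled to 5 samples.
print(resample_uniform([0, 1, 4], [0, 10, 40], 5))
```

Interpolation invents values between measurements, so a design that relies on it should probably show where the real samples were.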

The structure of the assignment (or why is this so complicated?)

There is a broad range of skills and interests in the class. I want to keep my promise of “Programming Optional” – and even for people who want to program, there is a wide variance – I don’t want this just to measure how much experience programming you’ve had before this class. The main objective of the assignment is for us to use a very common and standard chart type to explore visualization principles and design. It’s a real problem that lots of people have, and I don’t think there is consensus in the literature on how to address these challenges. In fact, I don’t think there is much literature on the problem. People just do the basic stuff. However, I want to give people who want to learn about implementation a chance to try it out. Or, if you are a good programmer, you might think it’s easier to explore a design idea with code. Or if you’re interested in experiment design, you might want to do some of that. Or… So, I am giving you a lot of choices. I may regret it. And I make no promise that the hardness / amount of effort will be balanced. Your assignment must excel in at least one area – and you can choose which one it is. However, you must do at least some of each part of the assignment.

Task / Design Analysis – You must think through what people do with this kind of data, and what they want from their visualization tools. Then you will look at the available visualizations and see how well they work for the different designs (both the already known ones, and new ones you will create).
Novel Designs – You need to come up with solutions to some specific use cases (or general cases) with “new designs.” This might be some radically new visual encoding, or maybe just some tweaks on an existing design (like coming up with a clever use of interaction).
Implementation – You will need to show off your designs. This might be building an actual system that lets people try it out. It might be a set of hand-drawn sketches. There are lots of things in-between. Some notable categories follow (these aren’t hard distinctions, and the names are somewhat arbitrary).
- Tools – are programs that can read in new data sets. They will let a user (or a grader) try out your designs with their own data.
- Prototypes – are visualizations that are created with real data (like a tool), but only for a fixed set of data. For example, if you write a program that only works with a few data sets, or manually draw a picture from a data set. If the grader can’t try their own data set, then it’s not a tool – but a prototype.
- Sketches – don’t use real data. They approximate data, or use fake data. Usually you do this if you’re drawing something by hand, but you can imagine a program that draws a picture of what a visualization would look like without actually computing it from data.

The end result

The big piece of this assignment is to develop a set of designs for the data type (sets of series data) – basically designs to improve on the spaghetti plot.

All assignments must provide a mix of design and critique. Each person (or project team) must provide two designs, and for every design, there must be a critique that (at least) identifies tasks that it is good and bad for, and compares it against the 3 simple/obvious designs above. However, beyond that, we give you the choice whether to focus on developing implementations, coming up with novel designs, or being more thorough in your analysis.

Everyone must do some of each of the 3 aspects (design, critique, implementation): but you can choose how to do some of each. A good project might pick 2 designs, do amazing interactive implementations with real data, and do a basic critique of each (comparing them to the baselines and identifying tasks they are good/bad at). Or a good project might explore a wide range of designs, only implement them with hand-drawn sketches, and provide a thorough critique and analysis (not only of each design, but to assess how the set of them address different tasks). Or it might mix something in-between.

You must come up with at least two designs. Your designs should not be “too similar” to the designs we’ve given you (in either the baseline examples, or the “some other resources” papers below), or too similar to each other. If you’re concerned that a design might be considered too similar, explain in your description and critique why it is different (and motivate those differences).

If you are focusing on providing a breadth of designs (rather than implementations of specific designs), you may use some of the alternate designs discussed under “some other resources.” Be sure to clearly label these as not of your own invention. You can’t count these among the minimum two (and you probably should have more). But, you can provide a critique, a discussion of how they address the tasks that you consider. Alternatively, you can use these known designs as additional baselines for your critiques.

To come up with designs, consider trying the “4 design moves” from How to Vis (transform data, change encoding, change layout, add interaction). You need to give a rationale for each design explaining why you think it might be a better alternative for the scenario than the basic designs.

Expectations clearly vary based on what you do: tool building is time consuming, so you may do less of other things (including having complex designs) – if you’re “just” sketching designs, we would expect you to excel in other ways. This makes evaluating the assignments really difficult – and planning ahead for it hard. I’ve seen broad ranges of assignments (for similar assignments in the past): “programmed implementations” range from the minimal amount of code to get a basic graph up using a standard Python or R library to complete systems that someone would actually want to use. I’ve seen some amazing “sketches” – like one where a student showed the interactions in their design as a storyboard made as a series of photographs of a white board with post-it notes moving around.

We will provide some basic implementations of the basic designs. You can use these as a starting point for your own implementations (be sure to give proper attribution!). You can use these to make pictures that you draw on to make sketches of fancier designs. You can use these to see how bad the basic designs are for the problem you want to solve (assuming you create some test data). You can ignore them.

If you choose to program… we don’t care what tools you use. You must be able to give us a demo – so make sure it runs on your laptop. We may or may not be able to run things (even if you use tools that we have, there may be version issues or setup issues or …). That said, you must give us everything we would need if we wanted to try to run your program – including a list of things we need (e.g. libraries and other setup) and instructions. You also need to give us pictures of what your program looks like when it runs.

If you choose not to program, we’re still curious how you made stuff. Usually we can figure it out. But if you drew pictures with a drawing program, or generated things using some tool like Tableau and embellished them, or …, let us know.

Data Sets

Especially if you program, you will want to have good example data to show off your designs. Coming up with example data can be part of the challenge.

We are only providing very limited sample data (see here). The generator is part of the baseline code repo (here).

Finding good example data is part of the project – as with anything else, you can decide how big a part it is. You might want to invest energy in making a really cool example data generator (that generates synthetic examples with different properties so you can check that your visualizations show them), or you might want to find interesting real examples (please, only ones that you can share publicly).
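As a rough idea of what a synthetic generator could look like, the sketch below makes noisy sine waves and optionally plants an upward trend in one line, so you can check whether a design makes that line stand out. This is not the generator from the baseline repo; the function name, its parameters, and the planted pattern are all assumptions.

```python
import math
import random


def make_series(n, m, noise=0.1, trend_line=None, seed=0):
    """Generate n lines of m samples: noisy sine waves with random phases.

    If trend_line is an index, that line gets a steady upward drift added,
    which gives a known "needle" to look for in the visualization.
    """
    rng = random.Random(seed)  # seeded so the same data can be regenerated
    data = []
    for i in range(n):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        line = [
            math.sin(2.0 * math.pi * j / m + phase) + rng.gauss(0.0, noise)
            for j in range(m)
        ]
        if i == trend_line:
            line = [v + 0.05 * j for j, v in enumerate(line)]
        data.append(line)
    return data


# 5 lines of 100 samples, with a planted trend in line 2.
series = make_series(5, 100, trend_line=2, seed=1)
print(len(series), len(series[0]))
```

Varying `n`, `m`, and `noise` gives a cheap way to see how a design behaves as the data scales.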

If you do come up with good example data, please let us know (as early in the project as possible) – especially if you’re willing to share it with others. We will reward people who come up with example data interesting enough that we try to share it with the class.

Comparison with prior years…

We gave a version of this assignment last year. It was design challenge 2. You’ll notice that the example code/data repo on GitHub says “2017-DC2”. The assignment was revised for this year – the basic problem is the same, but we have given a lot more specifics to what we ask you to do. Also, this year, we’re doing it later in the semester, so that you’ve learned more visualization concepts to apply (for example, the “too much stuff” readings and lecture).

The parts of the assignment

Note: the different parts of the assignment connect. So when you’re doing an earlier part, think ahead that you will be using these results as the basis for the later parts. For part 1 each person must work alone and submit their own work. For later parts, you may work with a partner.

We will give a check/no check grade for each phase – but we will consider all phases in assigning the final grade.

If you work with a partner, you must tell us your partner in phase 1. You may not pick the same partner you worked with in Design Challenge 2. Both partners will get the same “score” for the main parts of the project, but it is possible to get different grades (since the individual parts of the project also count).

Phase 1: Task Analysis and Initial Critique

DC3-1 Task Analysis and Critique – Due November 16th

There are two parts to this: a task analysis (which has two parts: situations and tasks), and critiques. For the deadline, you need to turn in 3 different documents – one for each.

Task Analysis: You need to come up with lists of:

(at least) 3-5 concrete situations where this kind of data comes up. I gave 4 above (city/temp, sales, train noise, cables) – don’t pick those. Describe how these problems might scale (will they get large in N, M, or range?). If you can identify real, publicly available data sets, that’s great (but not required).
(at least) 7-10 tasks. Describe them both in terms of a specific situation but also in a more abstract way (examples below).

I’ll give you a few to start with… you cannot use these in your list (for part 1, you can use them in your list for part 2). I’ll describe them in terms of the cities/days example above (these are the “concrete” examples).

On which day was the greatest range in temperatures seen?
What city had the widest range of temperatures?
Was there a month in which a city had its temperature rise consistently?

For abstract descriptions, these might be:

Identify the sample (time) with the greatest range in value.
Identify the line with the greatest range in value.
Identify a consistent increasing trend for a line within a time range.
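One way to check that an abstract task is well defined is to write it as a computation over the data. The sketch below does this for the three example tasks; the function names are hypothetical, the temperatures are made up, and "greatest range at a sample" is interpreted here as the spread across lines at that sample.

```python
def sample_with_greatest_range(values):
    """Index of the sample (time) where the values across lines spread the most."""
    m = len(values[0])
    spreads = [
        max(row[j] for row in values) - min(row[j] for row in values)
        for j in range(m)
    ]
    return max(range(m), key=spreads.__getitem__)


def line_with_greatest_range(values):
    """Index of the line whose own values cover the widest range."""
    ranges = [max(row) - min(row) for row in values]
    return max(range(len(values)), key=ranges.__getitem__)


def longest_increasing_run(line):
    """(start, end) indices, inclusive, of the longest strictly increasing stretch."""
    best = (0, 0)
    start = 0
    for j in range(1, len(line)):
        if line[j] <= line[j - 1]:
            start = j  # the run is broken, so restart here
        if j - start > best[1] - best[0]:
            best = (start, j)
    return best


# Made-up temperatures: 3 cities (lines), 6 days (samples).
temps = [
    [1, 3, 2, 4, 6, 8],
    [5, 5, 5, 5, 5, 5],
    [0, 9, 1, 2, 1, 0],
]
print(sample_with_greatest_range(temps))
print(line_with_greatest_range(temps))
print(longest_increasing_run(temps[0]))
```

A task that is easy to compute is not necessarily easy to see, which is part of what the critique should examine.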

Note: there is a hard cutoff for this part. After the cutoff date, we will share lists – so you can take ideas from others for your parts 2-4. You will turn in this assignment as 2 lists as a Canvas assignment. Note: keep the future phases in mind as you do this.

Initial Critique: We have a few basic designs [Spaghetti, SmallMultiples, Lasagna], and a few tasks (from your lists that you’re handing in).

In this part, you need to pick a few “tasks” and critique each of the three basic designs [Spaghetti, SmallMultiples, Lasagna]. For each task, explain why you think each design may or may not be appropriate (consider how they might scale as N, M, or range scales). You may wish to sketch out what something might look like to better explain the pros and cons. Remember that a critique isn’t just to say what’s wrong – it’s also to say what’s right. Hopefully, you can identify some things the basic designs are good for as well as some things they are not. You will need to have some situations where the basic designs aren’t good for the later parts – where you will need to come up with something better.

In doing your critique: critique the design (e.g., spaghetti plot), not necessarily the simple example implementations given above. Although, you might use this as an opportunity to point out design elements that you need to get correct if you were to implement one of these baseline designs well.

Working with a partner: You need to do Phase 1 by yourself. You can work on later phases with a partner. However, you must tell us as part of Phase 1 who your partner is. (normally, we’d do this in week 2, but there is no handin for week 2 because of Thanksgiving)

Please hand in the 3 parts of Phase 1 as 3 separate files (concrete situation descriptions, task lists (with both concrete and abstract versions), and critique). We will consider the quality of what you hand in as part of your overall grade. (you won’t get a separate grade for the phases in this project).

Note: If you are working with a partner, please let us know (in the type-in box for the submission), but you still must turn in your own assignment.

This will be turned in on Canvas at DC3-1 Task Analysis and Critique.

Phase 2: Designs and Plans

DC3-2: Designs and Plans – Due November 30th

Pick at least 2-3 scenarios where none of the existing designs seem to work well. (by scenario, I mean a combination of task and data scale / data properties). Come up with some alternative designs that might plausibly provide a better solution than one of the 3 basic designs. (note: expectations for pairs are higher) You need to come up with at least 2 designs. Depending on what you do with the designs (later parts), you may need to come up with more.

For this phase, everyone (pairs can turn things in together) must turn in at least 2 designs, and at least 2 scenario descriptions. Our goal in this phase is to make sure that you’ve at least thought about possible designs that address problems. It is best if your designs address the scenarios.

In later phases, you will be asked to turn in more for each design. But for this phase, a sketch with a brief description will be sufficient. We aren’t going to evaluate your designs – we’re mainly checking that you’ve at least started to think about things.

The designs you turn in for this phase do not have to be the same ones you turn in in later phases (if you come up with something better).

We’d also like a description of what you are planning to do for the final turnin. If you’re programming, let us know – especially if you will have an interactive system that you’ll want to demo. If you have data sets you want to work with, let us know. These plans can change, but it will give us an idea what to expect later.

Phase 3: Drafts

DC3-3: Drafts – Due December 7th

We would like to see a “draft” of what you intend to turn in the following week. These can be rough – sketches of the designs you’re working on, screenshots of your program, … Our goal is to check to make sure that each group is making progress.

If you have found good data for testing your system, please let us know – particularly if you are willing to share it with your classmates!

Phase 4: Final Handin

DC3-4: Final Handin – Due December 14th

The official deadline is December 12th, the last day of class. University policy does not allow for assignments to be due during the summary period. However, all students are given a no cost extension to December 14th – so really that is the deadline. If you prefer to follow University policy you are free to turn in your assignment on the 12th.

Note 1: because we need to get grading done promptly, this deadline is fairly tight. There will be penalties for late assignments. We may not be able to take late assignments after December 17th. You have 5 weeks for this assignment – an extra day or two shouldn’t make a big difference.

Note 2: if you want us to see your program in action, we will need to arrange a demonstration. You should assume that we will not be able to run your code (so give sufficient descriptions and screen shots – you could even provide a video). However, seeing a live demo is often the best way to appreciate things, so we will make arrangements for this. If you want to do a demo, please make this clear in the project hand-in, and look for instructions on the class website. Note: we may not be able to do demos (the logistics of demos during exam week are challenging). Please make sure that there is sufficient information (screenshots, descriptions) that we can figure things out even if there is no demo.

Note 3: You are responsible for finding good data to test your implementations on. We will try to find some (and also try to share ones that students tell us about), but you cannot count on being provided with data that show off your system.

Note 4: If you believe that Canvas is not an appropriate mechanism to turn in your assignment (e.g., you want to turn in big files or provide us access to a system via GitHub or a Docker image), please make arrangements before the handin date. You should still turn in the PDF documents by Canvas.

Note 5: There should be at least one PDF file (see #1 in the list below). Make sure it’s obvious which one we should start with.

For your final handin you should include:

1. A summary document describing all of the things you are turning in, with a list of all the designs you are considering. As part of your summary, you should include a comparison of the different designs you provide. There needs to be a single summary document for the assignment, and it should be obvious which one it is.
2. For each design that you consider, complete documentation including:
- A description of the design, including pictures. If you turn in a program, you still need to provide pictures (screenshots). If it’s hard to convey what is happening (e.g., you have interaction or animation), do as best you can with pictures – but also describe your demo (and possibly provide a movie).
- A discussion of the rationale for the design explaining why you created it. In particular, what tasks motivated it. You should describe a scenario for where the design is appropriate.
- A critique of the design, including a description of the tasks that it is good and bad for. Provide specific discussion of why your design is (or is not) adapted to the tasks.
- A comparison of the design against the 3 baseline designs (this can be part of the critique and/or motivation). If you want to make the critique aspect of your project stronger, you might consider more than 3 baselines (such as other designs in the literature – see the “some other resources” below)
3. If you programmed an implementation, whatever we might need to run the program. This includes source code, but also instructions (both on how to get set up as well as how to use the program). Remember, we may not be able to see your program in action, so please make sure there are sufficient screenshots and descriptions.
4. Any example data that you have to show off your implementations. Please provide a description of what the data is.

For parts 1 and 2, please provide PDF files. It is up to you if you want to provide separate documents (e.g., one for part 1 and one for each design), or one big document.

For parts 3 and 4 (if you have them): please provide a single zip file. Please put the documentation for how to use any software in this file.

Please upload one file to Canvas for parts 3 and 4. If you have more than one PDF, you need to make a ZIP. If you upload multiple files, we may not see all of them. Similarly, we may not see submission comments.

Requirements / What you’ll turn in / Assessment

There is a deadline for each phase. You will receive a score for each of the initial phases, and an overall grade (there is no separate phase 4 grade – the scores for the initial phases will be factored in to create an overall A-F grade for the whole thing).

Some thoughts on grading…

Grading will be subjective and arbitrary 🙂 This assignment gives students a lot of freedom to choose how to excel. Excellence will be rewarded with a good grade. And there are many ways to excel. We also appreciate that there are tradeoffs – if you put a ton of implementation effort into making a robust tool that anyone can use with their own data, we understand that your designs might not be as far out, or you may have fewer designs. If you come up with a wide range of really creative designs for different tasks with really detailed sketches and rationales, maybe you won’t have as detailed an evaluation plan – or any programming at all. However: all assignments must do some of all parts. You must come up with tasks/situations, create designs that address them, communicate them (either as a program or pictures), give rationales for them, and have some sense of how to evaluate them.

Ground Rules and Starting Points

We will provide sample data as CSV files. In these files, the lines will be in columns, and each “time step” will be in a row. The first column is an X value. You can ignore it (they will always be consecutive), but you may want to use it to properly label the axes.
We will provide example solutions for you to try. They will be simple implementations of the basic designs. You may use them as starting points for your implementations, or simply to get a sense of what the basic designs look like on different data sets.
Be sure to give proper attribution for any pieces that you “borrow” to start off with. It is OK to take our initial implementations – or the D3 examples – as starting points, but you must identify this and give proper attribution.
I made the most brain-dead simple implementation of the basic designs in Python using matplotlib. It looks terrible, and only does the simplest thing. But it can give you a starting point. You can get it on GitHub.
I have made some sample data for you to start with – it’s random noise (some of it is more structured so there’s something to look at). The code to generate it is in my Python repo, but you can get pre-computed sample files on GitHub.
Florian wrote a basic implementation using D3 (GitHub repo, running online version). This also serves as a starting point. But you can also look for the D3 multi-line line graph example.
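Reading a file in the CSV layout described above (lines in columns, time steps in rows, X value in the first column) might look like the sketch below. It assumes the first row is a header of column names, which the description does not actually say, so check the sample files. The function name and the inline sample are hypothetical.

```python
import csv
import io


def load_series(f):
    """Read a CSV with X in the first column and one line per remaining column.

    Assumes (unverified) that the first row is a header of column names.
    Returns (xs, names, lines) where lines[i][j] is line i at time step j.
    """
    reader = csv.reader(f)
    header = next(reader)
    rows = [[float(cell) for cell in row] for row in reader if row]
    xs = [row[0] for row in rows]
    names = header[1:]
    # Transpose: the file stores time steps as rows, we want one list per line.
    lines = [[row[c] for row in rows] for c in range(1, len(header))]
    return xs, names, lines


# Tiny inline stand-in for a real file; use open("your_file.csv", newline="") instead.
sample = io.StringIO("x,a,b\n0,1.0,4.0\n1,2.0,5.0\n2,3.0,6.0\n")
xs, names, lines = load_series(sample)
print(names, lines)
```

Transposing on load means each line is one list, which is usually the more convenient shape for drawing one line at a time.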

You may use publicly available code (e.g., D3 examples, or the example assignment code) as starting points for your implementation – however, you must give proper attribution (explain what you borrowed, and where you borrowed it from, and probably what you did to it).

Working with a Partner

We will allow you to work with a partner for this assignment.

You must identify your partner when you turn in Phase 1. Both partners need to turn in phase 1 independently, but also say who their partner is in the type-in.
Both partners need to agree to work with each other.
Both partners will get the same scores for the later assignment phases – we won’t try to attribute the work to one person or the other.
You only have to turn in 1 “writeup”. We expect more scenarios to be considered and designs to be generated. We expect implementations to be better.

Some Other Resources

The paper on climate model comparison gives a great example of coming up with a clever design for a spaghetti plot by really thinking through what their users need. Dasgupta, A., Poco, J., Wei, Y., Cook, R., Bertini, E., & Silva, C. T. (2015). Bridging Theory with Practice: An Exploratory Study of Visualization Use and Design for Climate Model Comparison. IEEE Transactions on Visualization and Computer Graphics, 21(9), 996–1014. http://doi.org/10.1109/TVCG.2015.2413774
My paper on comparison may give you ideas on how to think about the problem in a structured way – for defining tasks and considering design alternatives.
The original paper on Lasagna Plots (where the name comes from): Swihart, B. J., Caffo, B., James, B. D., Strand, M., Schwartz, B. S., & Punjabi, N. M. (2010). Lasagna plots: a saucy alternative to spaghetti plots. Epidemiology (Cambridge, Mass.), 21(5), 621–5. http://doi.org/10.1097/EDE.0b013e3181e5b06a
Our paper on tasks for different designs for line graph data (trying to get at what colors are good for and not): Albers, D., Correll, M., & Gleicher, M. (2014). Task-Driven Evaluation of Aggregation in Time Series Visualization. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2014), 551–560. http://doi.org/10.1145/2556288.2557200

Things that went wrong in 2017

Last year, we saw many great ideas for the problem – some were very creative. Many were well thought through (even if the design wasn’t great, there was a good analysis). However, there were some disappointing projects as well.

Here are some common mistakes that students made in a version of the assignment in the past:

Make sure that your designs really consider the data type (a set of series of values, where each series is a mapping from a continuous range to a continuous range – or a discretization of those ranges).
Be sure to explain your designs (if they are not clear). Give examples of their pros and cons (you can compare with the baseline). Make clear what kinds of tasks the design is good and bad for.
If your design is specific to a particular kind of data, be sure to be clear what that data is and what assumptions you are making. If you work on a specific data set, describe abstractly what other data sets things would work on.
In general, we prefer designs to be as general as possible.

An example: a student had a design that was a series of world maps. It was not obvious to the graders why a map could be used to encode a set of series. However, they could have clearly documented that they assumed that each series corresponded to a country (so, they had measurements for different countries over time). In this case, they could describe the limitations (must be countries, doesn’t scale beyond the number of countries, …)