# Design Challenge 2: A Visualization Project (Hard Vis Problems)

For Design Challenge 2, I wanted to give people a hard problem that I don’t know the answer to. In the process of deciding, I came up with 3 possible projects. Rather than me picking, I am letting each student choose for themselves.

For each, I am giving you a problem whose solution I am not sure of. Your task is to explore the problem in some way and come up with some solution(s). All of the problems are open ended. There are existing solutions for special cases in the literature, but the problems leave a lot of room for creativity.

The assignment is intentionally loosely specified to give students flexibility. If there is something you would like clarified, ask (preferably on Piazza). Expect updates as the project goes on.

Updates:

• December 9 - Information on final handins - see DC2 Final Dates and Presentations for timing and presentation info, and DC2 Final Handins for final handin format.
• November 18 - No self-signups for readings
• November 18 - Some of the deadlines have been postponed (Phase 2 and 3 are now on Wednesdays)
• November 16 - Phase 2 describes project proposal details
• November 12 - the Design Challenge 2 Example Data page is live.
• November 12 - some clarifications on what to do for Phase 1 are in the Phase 1 details.

## Basics

While I may describe problems in terms of a sample data set, your real goal is to come up with general solutions.

In all cases, your project is to come up with a general tool/design that can work for many datasets. You will have specific data sets for testing, but in all cases, you will be evaluated on the generality of the solution.

The three options are:

1. A tool for exploring the subgroup structure in population data (DC2 Option 1: Sub-Grouper)
2. A tool for examining the distortions caused by dimensionality reduction (DC2 Option 2: Dimensionality Reduction Explorer)
3. A tool for drawing charts at small sizes (DC2 Option 3: Tiny Chart)

No promise that these are of equal difficulty. Indeed, even within any option, there is a range of things that someone can do. We will try to correct for difficulty and effort: a less ambitious project done really well is as welcome as a more ambitious one that is less complete.

The “choose your own adventure” nature of this assignment will make it very difficult to evaluate. It also makes it very difficult to describe the assignment. I want to give people flexibility to do interesting things - the difficulty in describing and evaluating the assignment is the cost I pay. I think the answer is interactivity: take the time to ask and discuss things with the course staff. We will provide lots of opportunities to discuss ideas along the way.

The assignment has 5 phases - each one 1 week. Each phase has a milestone / hand-in. These are explained later.

Each student must choose one problem to work on (subgroup, dimensionality reduction, charts).

We will allow for work in teams, but some aspects of the project must be done independently (see below).

## The Design Challenge / Project

The problem descriptions are intentionally vague and broad. I do not want to prescribe what you must do. Not every project will consider every task. I am hoping that we will see a range of creative designs. I suspect there are many clever ways to use visual design and interaction to address the kinds of problems in working with this data.

By tool I emphasize that I am not asking you to make visualizations from a specific data set. But rather something that could be applied in various ways to various data sets. I do not just want pictures made from the data we provide. A tool might literally mean a system that can read in data sets and let the user explore interactively. Or it might be a visual design for a new visualization that can be used to show data in ways that address the task.

The important result for this challenge is the design and its rationale. You need to design the tool, explain it, describe the kinds of tasks it is meant to address, and give rationale for why the design is good for the things it is meant to do. Implementing a prototype to show off your ideas is often a part of it.

The implementation is separate from the design. Building a program that actually provides a realization of the design that is robust and polished enough to read in data sets and allow users to really perform explorations is one way to show off a design. There are others. For example, students have prototyped interactive systems with videos of things like moving post-it notes around on a whiteboard. Often these convey the design well, and the rest of the documentation makes it clear how it could be applied to different data, etc. There is a space of approaches in between. Sometimes there are simpler implementations that work on limited data sets or with limited robustness, but they make up for these limitations by having novelty and good explanations of what the actual design is (even if it isn’t well implemented).

This flexibility creates a challenge for specifying the assignment and assessment. Different types of “implementations” require different amounts of effort, and different skill sets. Writing polished interactive visualization systems takes a lot of effort - especially if you are not already fluent with the tools. A pencil and paper sketch can be made quickly.

A way to think about this is all assignments must excel in some way. If you want to turn in a “sketched by hand” assignment, that is possible. However, you will need to find other ways to excel: your designs will need to be extra interesting, your task analysis extra thorough, your example scenarios extra compelling, your descriptions extra clear, and/or your rationales extra compelling.

For any kind of assignment - programming or not - fluency with the tools (which is something you’ve gotten from outside of class) will make a big difference. If you’re an expert web programmer, or a skilled illustrator, you have an advantage. We will give some allowance for learning: we will ask you about your proficiency - and part of your project can be learning about the tools (we will rely on self-reporting). However, simply learning tools cannot fulfill the assignment: You can get some credit for learning new tools, but ultimately, you need to create visualization solutions that are well documented.

In an ideal world, everyone would have a really creative idea for interesting visualizations and interactions, and build robust web-deployed systems that allow us to try out these ideas at scale on the real data. In practice, most assignments will only do some of these. We understand that not all assignments will excel at both design and implementation. If you don’t have a great implementation, you can make up for it with more clever design, thorough analysis, extensive reading, and/or great documentation.

## The Phases

There will be 5 phases to the project, each with a deadline (so, a deadline a week over the 5 week project). The deadlines might be shifted slightly. The dates listed are the earliest the deadlines will be. Don’t expect too much in the way of extensions: we are bound by the need to get grades done at the end of the semester. And we have the intermediate deadlines to make sure that everyone is making progress all along, and to give us some intermediate results for feedback.

Note that I am using the term “team” even though I expect that many teams will be a single student.

More details on each phase will be provided later. There will also be some specific aspects in the sub-problem descriptions.

• DC2 Phase 1: Problem Exploration (Design Exercise 8) (due Mon, Nov 15)

The main goal in phase 1 is to understand the problem. We expect each student to use this time to understand the problem, generate some example data, and examine what the baseline or naive solution might be (in order to know how to improve upon it). Specific goals will be given with each problem.

Each student must do this phase independently, but we will survey people’s partnering preferences.

If you decide that you want to switch problems after phase 1, you must find a partner who did that problem for phase 1. For example, if you write your phase 1 about sub-group structure, but decide you prefer to work on tiny charts, you need to find a partner in phase 2 who did tiny charts for phase 1.

As part of this phase, we hope you will at least begin to identify relevant papers to read for ideas (see Reading below). While the initial posting to the reading discussion isn’t “due” until the next phase, we hope you at least start looking at readings as part of choosing which problem to work on.

• DC2 Phase 2: Proposal (Design Exercise 9)  (due Wed, Nov 24)

In phase 2, we ask students to describe what they plan to do for the project. This probably involves at least some initial design, but also involves choosing what kind of implementation project it will be (i.e., do you intend to do a mockup or a real working system? do you intend to focus on making a robust web app for a simpler design, or a simpler implementation of a fancier design? etc.)

During this phase, we will allow students to form teams (see below). We will have class events designed to help students identify partners. Teams must be formed before the proposal, and indicated in the proposal. Note: at least one member of a team must have done the problem for their phase 1. If you are proposing to work on tiny charts, at least one of the team members must have done phase 1 on tiny charts. Teams of 3 must be approved by the professor before submitting the phase-2 proposal.

During this phase, we will get some indication of the tools that you intend to use, and your proficiency with them.

After Phase 2, students may not change problem or teams, unless they discuss it with the instructor. We will only allow this in extreme circumstances.

Aligned with this phase, we will expect that everyone will be contributing to the reading discussions. The “due” date for the reading discussion is set to be aligned with this phase. But we hope people will be contributing throughout the entire project.

• DC2 Phase 3: Formulation (Design Exercise 10) (due Fri, Dec 3)

In phase 3, we expect teams to update their proposals based on their initial work. Hopefully, in this phase each project will have some initial designs that they can include in the updated proposals. Also, teams should have a better idea as to what they will really be able to achieve by the end.

Our intent is that these updated proposals will be shared with other teams for peer feedback. Peer feedback will be part of the project (but turned in separately).

• DC2 Phase 4: CANCELLED Initial Results (Design Exercise 11)  (due Fri, Dec 10)

In phase 4, we expect teams to give the course staff an update on their progress. In particular, we want to know what teams expect to turn in. This will allow us to adjust expectations, as well as to plan for grading. We expect to see some initial results, and a plan for how to evaluate your project (both self-assessment and staff assessment). We have cancelled phase 4 and will integrate it into the final handin.

• DC2 Phase 5: Final Handin (Design Exercise 12)  (due Wed, Dec 15)

Each team will need to turn in a project report and any artifacts. Artifacts include the program (everything we need to run it), any example data that you use for testing, any documentation you created, instructions for building and running the software, etc.

There will be a team handin (that includes the project report and any “artifacts”), as well as an individual hand-in (each student will need to provide a “self-assessment”).

We will give specific details on the report format and requirements, but it will include an “evaluation” section where you must assess how well your design and implementation work.

We will arrange for demonstration sessions for teams to show off their projects. Giving a demonstration will be optional, but we want to give teams a chance to show off their work.

Details on the mechanisms for turning in final projects will be provided closer to the deadline. Videos will be encouraged as part of the turn ins.

We also strongly encourage students to make their systems available on the web (e.g., via GitHub pages, or some service like Heroku).

## Working on Teams

Students are encouraged to work with a partner. We will not require partnering (students may be teams of 1). We will allow teams of 3, but this must be approved by the professor. Send a Piazza posting before the phase 2 deadline with who your group is, which problem you intend to work on, and why you think you will be able to do a more ambitious project with 3 team members. Teams of size 3 must be approved by the Professor before the Phase 2 proposal. Teams of size 2 do not need any approval.

We will do things in class (both in person and online) to help students identify partners. There is a problem with class teams that it rewards students who happen to know more people in class (so they have good ideas who to work with). We will try strategies to mitigate this. My hope is that students who want to work on a team will be able to find a partner with similar interests. If you need help, let the course staff know!

In a sense, the entire class is a team: we will encourage discussion among teams. While we like to see a diversity of projects, we also think that it is valuable for students to help each other and learn from collaboration. It’s not a contest: we really want every team to do something excellent.

Phase 1, the proficiencies part of Phase 2, and the self-reflection in Phase 5 must be done individually for each student.

All team members will receive the same grade for the main project phases. We will not try to attribute work to individuals. For every story of a “slacking partner” there is usually a story of a “lone ranger”.

The expectations for pairs will be slightly less than twice the individual expectations. However, there is an expectation that a project is integrated. Two independent projects stapled together will be assessed as worse than those projects as individual projects. However, there may be many ways to integrate work. For example, it could be one system that is more ambitious than could be expected from one person, or it could be two different ideas that are explored and compared.

For each phase, all students must complete the hand-in quiz. However, only one member of each team should turn in any artifacts or “real answers” to questions related to team work (beyond indicating that we should look at their teammate’s answers). The readings aspects of the project must also be done by every student.

## Interaction

Interaction and summarization are useful elements in all of these problems (though it’s a little tricky for tiny charts).

Implementing interaction well is hard. So, if you decide to do programming to build an interactive system, we will have realistic expectations.

However, it is possible for you to prototype an interactive design without actually programming an interactive system. You can simulate the system as a series of screen shots / storyboard / comic book, showing the sequence of steps and explaining the viewer’s actions. You can do this with a sequence of screen shots from your own program (it might not be really interactive), visualizations you made with some tool, sketches, …

Even if you do implement a system, you will need to describe it well. Unfortunately, we may not be able to run everyone’s program. Even arranging for demos will be hard (since it will be after the end of the semester). Make sure your document lets us appreciate what you have done. Videos are a good way to do this.

## Data

For most problems, we will provide some data - but all teams will be required to find data to test and illustrate their approaches.

## Programming

You do not have to program to do this assignment. If you want to turn in a “sketched by hand” assignment, that is possible. However, you will need to find other ways to excel: your designs will need to be extra interesting, your tasks extra compelling, your descriptions extra clear, and your rationales compelling. And it needs to be clear that your design is plausible (e.g., it doesn’t rely on data you don’t have or impossible analytics).

If you do program…

1. We may not be able to run your program. Your documentation must be complete and show off what the program can do. Be sure to describe things well and give pictures. You can even provide a video.
2. You can use any tools you like. We do not restrict you in terms of languages, libraries, etc - except that you must use tools that we legally have access to (do not use commercial libraries that require paid versions, unless the University has a license). You do need to tell us what you’ve used.
3. You do need to turn in everything we would need to run your program (in terms of the source code). However, we understand that we may not have the right environment to run it. Therefore, we may ask you to give a demo on your own computer (if your program requires a demonstration).
4. We will consider how expert you are with the tools you use. We’ll ask you in the proficiency assessment (phase 2), and as part of the self-assessment. However, there is a limit to how much value we will place on you learning tools. You can get some credit for learning new tools, but ultimately, you need to create visualizations that are well documented.

We have a strong preference for assignments that run on the web. We do not require it, but from past experience, it is highly correlated with positive responses from us. If you host an assignment on your CS home page, or as a GitHub page, or something else where you can just give us a link so we can try it out, that is great.

In an ideal world, everyone would have a really creative idea for interesting visualizations and interactions, and build robust web-deployed systems that allow us to try out these ideas at scale on the real data. In practice, most assignments will only do some of these. We understand that not all assignments will excel at both design and implementation. If you don’t have a great implementation, you can make up for it with more clever design and great documentation.

## Reading

As part of this assignment, you will need to identify and read some of the literature around the problem you are solving.

Our hope is for this to be collaborative: that everyone working on a problem will share the readings they find, and their thoughts on these readings. Therefore, we will have students report on their reading in discussions. We’ll try to do this as a Canvas discussion. If you identify a “new” paper, start a new thread. We encourage you to comment on the papers that you read - you might ask a question about something you didn’t understand, offer a recommendation to others about something you find useful (or not), give a quick summary of a main idea so that your classmates might not need to read it, etc.

As an experiment: we have created a Canvas Discussion: DC2 Reading Discussion (due Mon, Nov 22). Some things of note:

1. There is a due date: you must contribute at least something as part of Assignment Phase 2. However, we hope you will contribute before Phase 2, and continue to contribute afterwards.
2. There are separate groups for each sub-assignment. You should be able to sign up for whichever group you want to be part of (probably the problem you are working on). And you should be able to change groups if you change your mind. However, I have never used “self-signup” groups before, so I have no idea if this will work. Let’s hope for the best… Update 11/18: we manually placed students in groups since self-signup didn’t work.
3. We really want people to post “novel” papers that have not been posted before. However, we understand that you might get “scooped” - you thought you found a new paper, only to find that someone had posted it moments before. We will operate on the honor system: we prefer that you say “I found this paper independently” (as part of the thread for the paper), rather than starting a new thread. We don’t want this to be a race, but we do want to encourage people to find papers and share them.

We will evaluate your contributions to this discussion. We will consider if you identified papers, if you made meaningful comments on papers, etc.

We will seed the conversations with thread starters for the papers listed in the assignment web pages.

Doing this via an online discussion is an experiment. So if it goes horribly wrong, we’ll correct for that. Please at least try.

The expectations for reading can vary. If you are doing a more implementation intensive project, you might read less. If you are focusing on a novel design, you might want to read more so you can provide context for your design, and to identify inspiration.

When you post a new reading to start the thread, please give as much information as possible, including a link - but also relevant citation information. It can be useful to explain how you found it, and whether or not it was useful to you. You can help others by saying things like “I saw this paper with an interesting title, but it turns out it’s about this other topic…”

This does not just need to be “academic” articles: if you find useful web pages, blog posts, book chapters, etc. feel free to post those as well.

The course staff may also contribute to the reading discussions - beyond just the initial readings from the assignment.

## Grading

Grading is my least favorite part.

Do something cool that shows you’ve used your visualization knowledge to solve the problem, document it well, keep up with the deadlines, and you will be rewarded with an A.

As described on Grading, “A great project can pull your baseline grade up by a half-letter grade. A terrible project can pull it down.” This project has more weight than DC1. We will give you a final grade on an A-F scale that will consider your preliminary phase work. The preliminary phases may also be used in our “qualitative” judgment of your “Participation Grading”.

We will try to provide feedback in each phase about how we see the projects evolving, and how we think they might be assessed in the end. However, these estimates are often crude: we’ve seen great ideas in the initial phases be ruined by a failure to be written up well, or underwhelming initial proposals turn into really cool final results. The Professor may grumble if you ask about grading, but you can get some feedback if you ask.

We like to reward creativity, thoroughness in reasoning and rationale (having good explanations for why you think what you’ve done solves the task), and showing off good application of visualization knowledge. Slick demos that are fun to look at, and visuals that look nice, generally fare well, even though these are often signs of someone’s skills from outside of class.

We will consider your contributions to the discussions as both part of your class participation grade and your Design Challenge grade. More details on expectations will be described elsewhere.

## Where to start

The first step is to choose which problem to work on: read the page for each.

In class, we will have some discussion of the problems, as well as some chances to meet classmates in order to find potential project partners.

Keep your eyes open for more details on the phases. Expect postings with details on each.

And, maybe most important, ask questions: on Piazza, during office hours, or before or after class (sometimes these times can be a little rushed, but we will try to be more available). It’s hard for us to know what students don’t know. The assignment makes sense to me, and I know what I would try.

OK, if I were given this assignment I would have a hard time deciding which problem to pick. But I have ideas for each one…

## More Details About the Phases

What to do for each phase…

For each problem there are some specifics for each phase (see the problem pages); this section is to give you a general idea.

(and remember the reading requirements are separate)

### Phase 1

For phase 1, you must:

1. Be clear which problem you are choosing to work on
2. Identify a data set besides the ones we have provided (in the description and on the data sets page)
3. Discuss the tasks that you think a viewer would need to address with the data.
4. Identify problems in this example data, showing that you have examples of the problem to work on

Update 11/12: to be clear, identifying tasks (#3) should be general (not specific to a data set). #4 is more “dataset and scenario specific”. For example, with option 1, you might say a task (#3) is “find a subgroup that has no data” and (#4) might include specifics: “when I looked at the IPUMS data, I found that there are no educated old people in the northeast”. Or, with option 3, (#3) “downsize a line chart and still see the minimum and maximum precisely” and (#4) “I made this line chart and downsized it in Photoshop, and I can see these problems.”
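To make the option 1 example concrete, here is a minimal sketch of one naive way to check for the task “find a subgroup that has no data”. It is only an illustration, not a required approach: the function name `empty_subgroups`, the attribute names, and the toy records are all made up for this sketch (this is not IPUMS data), and it assumes categorical attributes with few enough levels that every combination can be enumerated.

```python
from itertools import product


def empty_subgroups(records, attributes):
    """Return the attribute-value combinations that occur in no record.

    Hypothetical helper for illustration; assumes every record is a dict
    that contains all of the listed (categorical) attributes.
    """
    # Collect the values of each attribute that are observed in the data.
    levels = {a: sorted({r[a] for r in records}) for a in attributes}
    # Record every combination that actually appears.
    seen = {tuple(r[a] for a in attributes) for r in records}
    # Any combination of observed levels that never appears is an empty subgroup.
    return [
        combo
        for combo in product(*(levels[a] for a in attributes))
        if combo not in seen
    ]


# Made-up toy records, only to exercise the function.
records = [
    {"age": "old", "education": "college", "region": "south"},
    {"age": "old", "education": "none", "region": "northeast"},
    {"age": "young", "education": "college", "region": "northeast"},
    {"age": "young", "education": "none", "region": "south"},
]
missing = empty_subgroups(records, ["age", "education", "region"])
for combo in missing:
    print(combo)
```

A list like this is a baseline, not a design: the number of combinations grows quickly with the number of attributes, which is part of why the problem is hard.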

Update 11/12: initially, I had hoped that there would be enough opportunity for students to work with real data as part of phase 1 (#4), looking through it manually to identify problems. This makes most sense with the subgroup option, but in general it may not be practical. The real objective is to “experience the problem”. So, you should sketch out what kinds of things you might look for (or expect to see).
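For the tiny charts option, one way to “experience the problem” without real data is a small synthetic experiment. The sketch below is an assumption-laden illustration, not part of the assignment: the function names `decimate` and `min_max_buckets` and the synthetic series are invented here. It shrinks a 100-sample series to 10 columns by keeping every 10th sample, and compares that with keeping each column’s minimum and maximum.

```python
def decimate(values, width):
    """Naive downsizing: keep every k-th sample so about `width` remain."""
    step = max(1, len(values) // width)
    return values[::step][:width]


def min_max_buckets(values, width):
    """Split the series into `width` buckets; keep each bucket's (min, max)."""
    n = len(values)
    out = []
    for i in range(width):
        bucket = values[i * n // width:(i + 1) * n // width]
        if bucket:  # skip empty buckets when width > len(values)
            out.append((min(bucket), max(bucket)))
    return out


# Synthetic series: flat at zero with one upward and one downward spike.
series = [0.0] * 100
series[37] = 9.0
series[73] = -4.0

naive = decimate(series, 10)
buckets = min_max_buckets(series, 10)

print("true min/max:      ", min(series), max(series))
print("decimated min/max: ", min(naive), max(naive))
print("bucketed min/max:  ",
      min(lo for lo, _ in buckets), max(hi for _, hi in buckets))
```

In this toy case the decimated series loses both spikes, while the bucketed version keeps them but needs to draw a range in each column. Neither is “the” answer; they are baselines that make the task from #3 concrete.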

Update for future: the prompt doesn’t require you to present an idea of what the solution is - just to identify the problem.

### Phase 2

For phase 2:

1. There will be a “proficiency assessment” (some questions on the quiz)
2. You must tell us who your team is. If your team is 3 people, this must be approved before submitting the proposal.
3. We will give details about the form of proposals. Considering what tasks you will address will be a big part.
4. You will submit a project proposal as a PDF.

Update 11/12: details on the “format” of the proposal are now given in the Phase 2 section.

#### The Project Proposal Format

Update 11/16: Proposal format added (below)

You will turn in your proposal as a PDF as part of the hand-in survey. There is no fixed format. You must have the names of all teammates on the proposal.

Generally, we want your proposal to describe (1) what you are trying to achieve, (2) what your “solution” will be (in design and form), (3) how you will create that, and (4) how will you evaluate what you have done. A good proposal makes it clear that you have thought about all of these things.

The proposal is a plan - not a contract. We understand that plans change. In particular, the specifics of your design will probably change as you work on it. The proposal should give a sense of the kind of solution you expect - if you make something “like” the thing you describe, then you will have succeeded.

The amount of detail you provide can vary. The important thing is to make sure to convey that you have thought through all aspects (the problem, the solution, the process, the evaluation).

While there is no fixed format, breaking it into sections according to the 4 parts (or subparts) may be a good starting point.

• Problem: be more specific than the assignment specification. What tasks are you going to focus on? (although, additional task analysis might be part of the problem) What kinds of data? Do you have examples of things that you want to make sure you can do?

• Solution: it might be helpful to divide this into two parts: design and implementation.

  • Design: what will your solution be like? This might be a sketch or a storyboard - or even a more rough description. Trying to create a sketch in detail can help you work out whether the idea is good. At this point, the important thing is not to have “the” design, but at least to show you’ve thought about it enough to have a sense of what the solution will be, and the confidence that you’ll be able to come up with something. You might not have one complete design: you might have a bunch of different possible strategies, or a rough sense with notions of the pieces you still need to figure out.

  • Implementation: even if your project is destined to be a design study focusing on sketches and rationale, there is still a notion of “what will you actually make.” Describing the form of what you will create (a set of hand-drawn sketches, an interactive web application, etc) is important. More details are helpful. What tools are you going to use? Do you have a sense of how to deploy it?

• Plan: what will you actually do? I strongly recommend making a timeline - how do you expect to get the things done in the amount of time you have (which isn’t very much). Be sure to leave time for “writing things up.” Good plans often identify risks and have fallbacks.

• Evaluation: how will you decide that your project is successful? What will you want to show? Your proposal must address the evaluation aspect.

### Phase 3

For phase 3, you must:

1. Describe the progress you have made. There should have been some progress since the proposal. It might be some initial designs, some initial implementation experiments, etc.
2. Describe any updates to your proposal. You can turn in an updated proposal.
3. In this phase, being specific about the tasks you are focusing on will be important.

These will be submitted as PDFs.

### Phase 4

For this phase, we will ask you to turn in something that shows us your progress, and will give the course staff a sense of what you will have at the end. Screen shots, initial results, sketches of initial designs, etc. are useful. The more that you can tell us about what you expect to turn in, the better we can help make sure it is aligned with expectations. Often with ambitious projects, we can help narrow things down.

We have cancelled phase 4.

### Phase 5

For phase 5, we will ask that you turn in your “artifacts” (program or other designs), a “report” (describing everything), and individual “self-reflections”.

## History

Mainly for my own reference, the historical projects