Getting Started (TL;DR)

by Mike Gleicher on August 22, 2017

There is a pile of stuff on the course web for you to look at. Enough that it might be hard to find everything. Here’s a summary of the key points, with links…

List of Pages and Posts to begin with…

  1. Course Announcement (general info)
  2. What is this class and why?
  3. Syllabus (class info – not the detailed schedule)
  4. Course Policies (page, stuff could be on the syllabus)
  5. Books for the class
  6. Changes for Fall 2017
  7. Rules for Seek and Find Assignments and Online Discussions
  8. The class schedule isn’t up yet (as of 8/15)
  9. How to Visualize – this will be a reading for week 1, but it might give you a better sense of my philosophy.

How to Vis

by Mike Gleicher on August 22, 2017

How to Visualize

This document serves three purposes – that are not independent:

  1. It will give you my philosophy about how to do visualization
  2. It will give you a sense of why the Vis class is the way that it is – and suggest what you might expect to learn over the course of the semester.
  3. If you aren’t going to take the class, it will give you a sense of what I might do if I try to help you with your visualization problems, or what you should try to do if you go at it on your own.

You might think of this as the whole semester condensed into a single blog posting.

This is adapted from my usual first lecture in class – except that it doesn’t have as many fun example pictures. However, since lots of people don’t get to see that lecture, this tries to get the main point. And even if you do get to see the lecture, it’s a reminder of the steps of how to do visualization.

I was going to title this posting “All I ever really needed to know (to do Visualization) I (could have) learned in Tamara Munzner’s Nested Model Paper” – as a play on the title of the book “All I Ever Really Needed to Know I Learned in Kindergarten,” but I haven’t actually read that book, and the humor is lost if you haven’t heard of it. And explaining how the philosophy of the nested model morphs into my process below is a story not worth telling.

This document is an update of my 2015 “How To Do Visualization” posting. It also has a bit of redundancy with the “What is this class and Why” post.

What is Visualization Anyway?

It’s surprisingly hard to define visualization, but probably not important.
The more interesting question is to define what is a good visualization.

Here’s a rough definition of visualization that is purposefully broad, but surprisingly good enough:

Visualization: a picture(1) that helps someone(2) do something(3).

The picture part is hard actually, since a visualization may not be a picture in a traditional sense – it can be anything you look at. For example:

Physical Bar Chart

Lego Tree Map

A physical object that you look at can be a visualization (like the blocks of snow or lego model). Or a visualization could be an animation, or something interactive. You might argue that we should relax the “look at” and bring other senses to bear (e.g., auralization to communicate data via sounds). However, while there are similarities between vision and other senses, there are enough differences that I think it’s best to focus on visual things (things we see) for this discussion (and the class).

The important part of the definition is that it helps someone do something. What makes a picture a visualization is a sense of purpose: it’s going to be used for something.

The more specific term “data visualization” is challenging since almost any visualization has some data in it. I’m not going to try to define “data visualization”.

Example: Architectural visualization is often not considered data visualization. However, there is “data” (information about the building being visualized). It is a visualization since it is a picture (or an interactive system that generates pictures) that helps someone do something: in this case, it helps an architect or their client get a sense of what a designed building will look like before it is actually built.

A good thing about my definition of visualization is that it focuses on this sense of “task” – the picture is meant to do something, so we should think about what it is trying to do to make sure it really can help someone do the thing it’s meant to do.

The definition doesn’t necessarily say that the visualization succeeds at helping someone do something. We can certainly have bad visualizations that don’t help. Effective visualizations are pictures that really do help their intended audience achieve the task.

Making visualizations isn’t hard. Making good visualizations is hard.

Aside: In Munzner’s book, she defines visualization as being “designed to be effective.” In my mind, she is defining good visualizations – bad visualizations might not be effective, or might not be designed.

The goal of this document/class is to teach you how to design/create good visualizations. With the emphasis on the “good” part – making bad visualizations doesn’t have to be hard, and is probably not worth the effort.

There are a few messages in this:

  1. A core of this class will be understanding what makes for a good visualization, and what we can do to design them.
  2. Figuring out what good visualization to make (designing it) is important: we don’t want to waste our time implementing bad visualizations.
  3. Understanding the principles and process of visualization can help us figure out what visualizations will be good before we invest too much energy in making them.
  4. Generating ideas for visualization and making sure they are good (and will lead to good designs when they are fully implemented) is my preferred approach. Finding ways to “prototype” ideas so we can assess them before investing too much energy is important.
  5. Implementing the design once you have it is not a focus in this class. It is a detail. A sometimes challenging detail. And it is definitely a practical concern: a great design isn’t of much value if you can’t make it real.

Implementations can take many forms. I’m not going to suggest you make snow sculptures (like the pictures above), but maybe prototyping with Legos (picture above) is a way to try things out. Your choice of implementation strategy is almost always dictated by practical issues (where you need to show your visualization, what tools are available, …). The appropriate tools change quickly. The principles of choosing what to make with them do not.

A side effect of this is that we are not going to focus on programming. In fact, later in this document, I’ll argue that the goal should be to avoid programming. If you can get by with existing tools, you should.

What are good visualizations?

To make a good visualization, we need to decide what a good visualization is. And then we can consider a process to make them.

Defining “good” visualizations will be a major topic in this class. Evaluation considers how we decide if a visualization is good or not. At a high level, the definition of visualization provides an answer:

A good visualization is one that effectively serves its intended purpose (helping the audience do the thing the visualization was meant to help them do).

Exactly how to measure whether a visualization does what it needs to do is more challenging, and is a topic we’ll come back to.

One important and useful technique to assess a visualization (or just about anything) is critique. Critique is the “standard” design practice of looking at something carefully and discussing it. Critique is a really useful assessment approach because we can apply it to existing designs (e.g. created by others) to learn from them, or to our own ideas. It can be applied to finished designs, or rough ideas. Learning to critique is a valuable skill for all design – and it’s not something that is typically focused on in CS education.

Critique will be a key technique in this class.
Teaching students to do critique (via lots of practice) is a key component of the class.

Note that a good visualization doesn’t have to be fancy – it has to be effective / get the job done. In fact, using a standard design is often desirable: you don’t need to teach people how to use a new design, and you can probably find an existing implementation.

Here’s my favorite analogy. You go to the doctor’s office because you feel sick. The last thing you want to hear is “that’s a novel and interesting problem! we need to devise a novel treatment. let’s write a grant proposal and hire some research assistants…” No, you want to hear “I’ve seen that before. No problem. Take two aspirin and see me in the morning.”

As visualization practitioners, our goal is to be able to look at a problem and make those kinds of prescriptions. The task identification and abstraction are key here. It’s how we can say “I’ve seen that before” and get to “take two scatterplots and see me in the morning.”

How to make a good Visualization?

Here is my three step recipe:

  1. Why are you making this visualization? Who are you trying to help? What are you trying to help them do? I refer to the latter as the “task” – and it’s usually more important than the who part.
  2. What data are you trying to use to achieve this task?
  3. How are you going to use the data to help achieve the task?

I split question 3 into two parts. There’s a planning part, and a part where you realize that plan. Which leads to the four step recipe.

  1. Task
  2. Data / Resources
  3. Design
  4. Details

In the ideal world, you start at the top, and work your way down through the list.
The steps are iterative: at the end of each step (ideally) you do some evaluation (e.g., critique) and maybe go back to a previous step.

Sometimes the steps don’t happen in order. For example, you really want to use a particular tool, try out a new algorithm, or make things a particular color, so you go looking for something to make with these details.

Sometimes, the process seems to start with #2 (Data): one gets some data and needs to figure out what to do with it. But this is actually an initial task: find what is interesting in the data. Often there is an iterative cycle – as the designer understands the data more, they can refine the task.

In a little more detail

  1. Task – understand the purpose of the visualization. Who is it meant to help? What is it meant to help them do?
  2. Data – what resources are available to help achieve the task? The main thing is data.
  3. Design – what is the strategy for mapping the data into something visual?
  4. Details – how will you make this strategy into a specific picture / system that produces pictures? What are the specific choices (e.g., colors, implementation, …)

You may notice that this parallels Tamara Munzner’s nested model for validation. (It’s discussed in her book, but was a great paper first) I think in terms of visualization design, not just validation (but evaluation is so important to design that it might not matter), so I changed the layers a bit.

If all goes according to plan, you’ll understand these 4 steps in the first few weeks of class.

How do we think about tasks and data?

Visualizations help someone do something for some reason (who, what, why).

The better that you understand what the visualization is trying to achieve (what will it help the person do), the more likely you will come up with a good solution. In the end, everything serves the tasks.

Note the plural: you may have a set of tasks. Often, there isn’t just one at a time. There are a set of things that a set of someones may want to do for a set of reasons. And maybe your solution will address many of these.

I was going to say “it starts with the tasks,” but sometimes you start someplace else (like you have some data and say “I’d like to do something with it” – but even then, I would probably say you have a task: figure out what the right questions to ask are!). However, in those cases, it’s really important to remember that task is key: the sooner you get to “what is this thing going to do for someone,” the better off you are.

This is also not to say that you need to fully understand the task at the beginning. Sometimes, your understanding of the task is hazy, or changes as you learn more (from later stages).

Task is an informal, fuzzy notion. It doesn’t always get explicitly written down or defined. But the clearer you are about it, the better off everything else will be. You can’t succeed unless you have something to succeed at.

One other detail on task: there is a range of kinds of tasks. There are abstract tasks and concrete application tasks. This is actually a spectrum/continuum.

While task is the most central thing, it’s also hard to talk about. We lack good, rigorous ways to talk about it. For the longest time, it meant that it didn’t get discussed enough (in the literature, in my class, in my work, …). The fact that it is hard shouldn’t get in the way of us trying to get better at thinking about it. We particularly lack good ways to talk about different levels of task abstraction.

Where I start…

When I talk to a new (potential) domain collaborator, I always start with the question “tell me about your science.” I want to know the big picture (the why) – because without it, it’s hard to have context.

My first goal is to identify the problem that needs to be solved – it won’t help anyone if we solve the wrong problem.

Usually people come thinking they want specific help – they want to start with the data, or worse, with the way they are looking at their data (can you make a better chart for me? not without understanding what you are trying to do, so I know what “better” means!) We will get to that, but I think it’s important to identify the task first.

I’ll stress this: if you want to be a visualization scientist (or more generally, a data scientist or computer scientist), one of the best skills you can have is to be able to help people identify their problems. I think it’s hard for people to identify their problems. Part of this is that people get so caught up in the details, that they lose sight of the big picture. Or that they are so set in how they do things that they lose the ability to imagine alternatives.

And, as computer scientists (and/or mathematicians), we have a secret weapon: abstraction. This is something that we value/stress much more than other disciplines. For this task phase of visualization, abstraction is a key tool. If we can recognize the abstract task of which the real problem is an instance, the path to solving it becomes much clearer.

How do we make a design?

A design is the plan for how you are going to turn the data into a “picture” that helps with the task. This is why it’s so important to understand task and data before trying to make a design.

Once you know your task and your data, you can try to design a solution. I say “design” to explicitly separate the act of coming up with the idea from actually building it (implementation). Design is the act of making conscious choices to solve a problem. (Defining design is a whole philosophical debate – but that definition is one I like, and will work with for the moment)

In terms of the class, a big part of what we’ll do is focus on design. What are the choices you can make, and how can you make good choices.

There are four main categories of things that we consider in designing a visualization. You can think of these as the kinds of choices you can make, or the kinds of building blocks you can build a visualization out of. I sometimes think of these like moves in a turn-based game, at each step I pick one of these things to either add (or change, if I am doing redesign).

  1. Data Transformations – we compute some derived thing about the data that will be useful in one of the other steps
  2. Layout – we decide where things go. Technically, this is a position encoding (see encodings below), but position is such an important thing, it gets its own special category.
  3. Encodings – an encoding is how we choose to map a data variable to some “visual variable” (an attribute of what we see – like color). Position is a visual variable, but it’s special enough that it becomes its own category (see layout).
  4. Interaction – taking user input is another thing you can do in a visualization. Often, input can be thought of as mapping input actions to changes in the visualization.

For a simple example of applying these four steps see “A Simple Example: 4 Design Moves.”

Almost everything we do in designing a visualization turns out to be making one of those 4 kinds of choices. Almost every visualization can be thought of in terms of these 4 building blocks.

I find this list to be a useful way to organize the larger list of more specific things you might do. Most things fit into one category or another. I won’t waste time arguing this is the best categorization – but it’s good enough to give you a sense of the kinds of things that you can think about.

We’ll learn how to choose these different components, and use them together. We will look at visualizations and try to understand them in terms of these four components. We’ll think about redesigning visualizations by changing the choices. We’ll try to develop a sense of how to map tasks and user goals onto these kinds of choices.

How do we make good choices for design?

Creating a visualization is about making those choices for a design so that the result is effective for the task… but how can you choose wisely?

Part of it is trial and error. Sorry. But, this is why we emphasize prototype and critique so much.

But there are things we can use that can hopefully help us make better choices. Some examples (which are, of course, things we’ll study in class):

  • Principles of Design – General ideas on how to make things that are “nice” visually and communicate effectively. These principles are the same if you’re designing a visualization, a web page, your resume, … – so they are good principles to learn!
  • Principles of Visualization – Over time, people in the field have gotten some ideas about what works and what doesn’t. Sometimes, this folklore is made up and may not be true. Other times, it comes from experience or has been proven by experiments.
  • Principles of Perception – Understanding how people see (as in how the visual system works and how the brain interprets images) provides a lot of useful clues as to what designs will (and won’t) work.
  • Examples – Looking at existing examples – both good and bad – can help us. Sometimes, we can gain intuitions so we can make new designs. Other times, standard solutions provide us with answers, or at least a starting point.

But what about implementation?

Actually realizing the design is the last part. Well, not really, since usually the process of making a visualization is iterative: once you make something, you learn from it, and refine some of your earlier work, and try again.

If you were thinking “this is a CS class, we should focus on implementation,” you will be disappointed. As I’ve said, this class is more about how to figure out what the right picture to make is (e.g. the design) than how to make it. It’s a waste of energy to spend time making the wrong picture.

In the ideal world, you can think about implementation last – it’s an afterthought. In practice, the constraints of having to implement things will probably influence the kinds of designs you will want to consider. A design becomes less attractive if it’s too hard to build. In practice, there’s often a tradeoff between the practical issues of implementation and having the best design.

Even within implementation, there is a spectrum of levels. I like to think of this as “fidelity of prototypes.” In a sense, you can think of a back-of-the-napkin sketch as an implementation of a design. Most likely an incomplete, non-final one, but a concrete instantiation. It might be a good enough implementation that you can evaluate your design and decide if you want to pursue the design further (and make a higher-fidelity prototype). If you’re lucky, a crude prototype might just solve the actual problem.

One thing I like to stress is the importance of prototyping to explore designs. It’s best to try out lots of ideas, and see if you can figure out their problems before investing a lot in implementing them. Good “Designers” (graphic designers, industrial designers, …) usually like to explore an entire space of designs – by using very crude “implementations” (e.g. sketches).

Data analysis tools – things like Excel (yes, Excel will turn out to be one of my favorite visualization tools) or Tableau or … – often let you prototype lots of different things with your data. This “playing” with data – re-ordering it, making various kinds of pictures with it, looking at it all kinds of different ways – is actually a form of rapid prototyping. You can explore a lot of designs easily – often to decide that they don’t solve your problem – but sometimes to see that some of the simple elements actually can help. This “playing with data” (if you can do it) is a lot like sketching a lot of visual designs.

Having a good toolbox so that you can implement your designs is useful. If you don’t have one, you will be limited in what designs you can explore, and won’t be able to choose designs that you can’t realize (that’s not quite true: if you can come up with a great design, you may be able to get someone else to implement it). Part of my premise for this class (or at least this instantiation of it) is that we can all have different toolboxes – some students might be wizard programmers, some might be fabulous artists – but we all can have some common basic tools (e.g. sketching), and we can all explore designs using our respective toolboxes.

Now, if you’re saying “but I want visualization to be about writing fancy programs using complex data analysis methods and algorithms and spiffy programming things …” let me give you a bit of caution.

Building a custom visualization solution by programming should be a last resort. You should really believe that your problem cannot be solved by some easier method. Going back to the medical analogy, writing a program for a new design is like inventing a completely new (and therefore untested) treatment. Yes, if your patient has a mysterious disease and is going to die you want to take these drastic measures. Or, you might do an experiment if you believe that you can afford the risk on this patient in order to learn something to save the next ones (this is the excuse we use as researchers).

That said, all too often there are other factors that make us want to take the extreme measure. Sometimes, we just want to practice our inventive skills. Sometimes our “customers” think they want to have something novel (don’t make it look too easy!). Sometimes we really want to try out some implementation idea, or show off some challenging design idea. And sometimes, it might just be easier to re-implement a standard design than to figure out how to make an “easy” tool do what we want. (you’d be amazed how often I’ve found myself writing Python code for scatterplots because I wasn’t in the mood to wrestle with Excel). Sometimes, it’s hard to find a decent “easy” tool for something that should be easy (like graph layout).

So what do we do?

After that, you can guess what the topics of the class should be. But, to be explicit, here’s a list of what we did in the Spring of 2017.

  • (2017:2) Understand What is Visualization
    We will try to get a better sense of the broad range of what visualization is.
  • (2017:3) Understand why to use visualization
    This gets at that notion of task, and sets up for the notion of evaluation
  • (2017:4) Discuss strategies for Evaluation
    There are many different ways to assess if a visualization is good. And since good visualization is our goal, knowing how to measure “good” will be important.
  • (2017:4?) Critique Skills
    Critique is one method for evaluation – that we will use extensively throughout class.
    I believe that it is best learned by practice, so we will do it a lot. But at the beginning of class we’ll take some time to develop critique skills.
  • Design School
    I can’t fit 4 years of an art degree into one lecture. But we can try to get to be a little better at doing design.
  • (2017:5) Understand Data and Task Abstractions
    Abstraction is the key way that we use to talk about tasks and data.
  • (2017:?) Visualization Principles
    There are some basic good ideas
  • (2017:6) Understanding Encodings and Standard Designs
    Encodings are the basic building blocks of visualizations.
    We’ll use these as a way to look at how the standard designs can be broken apart.
  • (2017:7) Understanding Perception
    A good source of principles and design ideas is from the science of perception.
    We’ll try to learn a bit about how we see, and what this might mean in terms of the design of visualizations
  • (2017:8) Color
    How we see color, and what this means in terms of how to use color effectively in visualization will turn out to be a big topic.
  • (2017:9) Interaction
    Interaction is a key tool in creating effective visualizations. We’ll try to understand how to use it effectively. Interaction is (often) tricky to implement and prototype.
  • (2017:11) Implementation
    We’ll talk about basic strategies for implementation – types of tools and how to choose. We’ll talk about some specific tools as examples, but more in terms of understanding the kinds of tools and toolkits (and how to choose between them) than the specifics of any particular tool.
  • (2017:?) Implementation: Specific Tools in Depth
    I hate to spend much time on any specific tool.
    But inevitably, you may want to make a visualization with something more than a sketch, so getting exposed to some common tools is helpful. Even if you choose to use different ones.
  • (2017:12) Multi-Variate
    We’ll look at common strategies for the challenging cases of multi-variate data.
  • (2017:13) Dealing with Scale
    Having “too much data” is a common reason why visualization becomes hard.
    We’ll look at some common approaches to deal with it.
  • (2017:10) Graph and Network Data
    Graphs (in the CS/Math sense of a network of connected things) are a common kind of data, and offer some important challenges.
  • (2017:15) Scientific Visualization
    There are some common types of data that come up in science and engineering applications.
  • (2017:15) 3D
  • Presentations – sadly, we usually run out of time before getting to this
  • Animation – sadly, we usually run out of time before getting to this

A Simple Example: 4 Design Moves

by Mike Gleicher on August 22, 2017

If you haven’t seen my “how to visualize” post, you might want to start there. Also, I usually do this example in lecture, so you may skip it if you’ve seen it before.

I like to think of visualizations as being made up of four components:

  1. Data Transformations
  2. Layouts
  3. Encodings
  4. Interactions

Here, I give a simple example of how those get used to solve visualization problems. The key points are to give examples of those components, and to introduce the idea of thinking of the components as “redesign” choices that can be made to improve a design so that it better addresses a task. A side effect is that it shows that you can do useful visualization stuff with simple tools (this is all in excel).

The Task: I need to look at the grade distribution for my class and get a sense if it’s reasonable / fair. (more tasks may emerge as I look at the data)

The Data/Resources: I have a table of student names (not shown), scores and grades. It’s in an excel spreadsheet – so I’d prefer to keep things there. I don’t have very much development resources (this is often done at the grading deadline).

The Design: I have chosen to use a table, since it’s easy in excel. And it may be good enough (especially with design tweaks).

I am going to describe this as a “game” in the sense of something where we have a “current state” and at each “turn” I get to choose a “move.” The moves are to pick one of the 4 design components and change it.

I’ll start with an initial design – the first thing that comes to mind. Just look at the table in Excel the way the data comes to me.

 

This table has 57 rows (there were 57 students in the class). The last column is the grade I am going to give them, which is created by rounding the “Average” column (which is why the grade column is called “Rounded”).

For my task, this table isn’t great – I want to know things like “Am I giving enough As? Are too many people failing? What’s the median?” So, I need to improve my design.

Move 1: Layout – For my first design move, I’m going to choose to change the layout (the positions of the elements). Right now I’m using the position on the Y axis to tell me student number. This is useful if I need to find a specific student, but less good for seeing the grade distribution. So, I will change the layout: I’ll sort by the average.

An aside: this could also be seen as “interaction” – I want my vis to do two things, so I make it easy for the user to click a button (the column sort buttons in excel) to switch between the two different things we may want to do. A joy of excel, when you’re good at it, it makes these kinds of interactions easy.

After applying move 1, notice that the resulting vis is much better – I can quickly get a sense of how many As there are, how many failures, what the median is, …
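For readers who would rather see Move 1 outside of a spreadsheet, here is a minimal sketch in plain Python. The rows, the 0 to 4 grade scale, and the half-point rounding are all made-up assumptions for illustration; the post does not show the real table or say how grades are rounded.

```python
# Hypothetical rows (invented values): student id, average, rounded grade.
# Assumes a 0-4 scale rounded to the nearest half point; the real scale is not given.
rows = [
    {"student": 1, "average": 3.2, "rounded": 3.0},
    {"student": 2, "average": 1.4, "rounded": 1.5},
    {"student": 3, "average": 3.8, "rounded": 4.0},
    {"student": 4, "average": 2.6, "rounded": 2.5},
    {"student": 5, "average": 3.1, "rounded": 3.0},
]

# Layout move: row position now encodes rank by average, not student number.
by_average = sorted(rows, key=lambda r: r["average"], reverse=True)

# With an odd number of sorted rows, the median is simply the middle row.
median_row = by_average[len(by_average) // 2]

for r in by_average:
    print(r["student"], r["average"], r["rounded"])
print("median average:", median_row["average"])
```

Nothing about the data changed; only the order of the rows did. That is the sense in which sorting is a layout choice.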

Move 2: Encode – In the big table it’s hard to see the individual grades (if I shrink it to fit on my laptop screen). So, I’ll apply a color encoding – using excel’s color range feature – on the rounded grades.
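As a rough illustration of what a color encoding like this does, the sketch below maps each rounded grade to a shade between white and green and prints a small text histogram. The grades, the 0 to 4 range, and the white-to-green ramp are assumptions made for the example, and the interpolation is a simple linear one, not a description of how Excel computes its color scales.

```python
from collections import Counter

# Hypothetical rounded grades (invented values on an assumed 0-4 scale).
rounded = [4.0, 3.0, 3.0, 2.5, 1.5, 3.0, 3.5, 2.5]

def sequential_color(value, lo=0.0, hi=4.0):
    """Map a value in [lo, hi] to a hex color from white (low) to green (high)."""
    t = (min(max(value, lo), hi) - lo) / (hi - lo)
    white, green = (255, 255, 255), (0, 128, 0)
    rgb = [round(w + t * (g - w)) for w, g in zip(white, green)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# One color per distinct grade, plus how many students received each grade.
colors = {g: sequential_color(g) for g in sorted(set(rounded))}
counts = Counter(rounded)

for g in sorted(counts, reverse=True):
    print(g, colors[g], "#" * counts[g])
```

The point of the encoding is that the grade can be read from color alone, so the pattern survives even when the numbers are too small to read.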

Now, I think I’m starting to win the game. I can really quickly see the proportion of each grade this distribution is giving – even in the thumbnail! It’s pretty clear that there aren’t many As, but there aren’t many failures either.

But, it does raise a new question (a new task): are there students who are being hurt by the rounding? This is somewhat easy to check, since I can look at people near the borders. But I can apply a visualization design move….

Move 3: Transform Data – I can make a new “derived” column that is the difference between the rounded grade and the average. This directly measures what rounding does to people. And to skip a step, I can color code it: I’ll use a red/blue diverging scale (you’ll learn about these later, but they are built into excel). That way, red means someone hurt by rounding, blue is someone helped.

Now I think I’ve won this round of the visualization game. To do my task of getting a sense of who is most hurt by rounding, I can look for dark red – if I want to make it even easier, I can sort by that column! (use interaction)

And note: the reason I consider this “winning” is that I am able to do the tasks I need to do (get a sense of the grade distribution, see who is hurt by rounding). The visualization “game” is ultimately about tasks.
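Move 3 and the follow-up sort can also be sketched in a few lines of plain Python. As before, the rows are invented, the half-point rounding is an assumption, and the `diverging_color` helper is a hypothetical stand-in for a built-in red/blue diverging scale rather than a reproduction of Excel's.

```python
# Hypothetical rows (invented values); assumes rounding to the nearest half point.
rows = [
    {"student": 1, "average": 3.2, "rounded": 3.0},
    {"student": 2, "average": 1.4, "rounded": 1.5},
    {"student": 3, "average": 3.8, "rounded": 4.0},
    {"student": 4, "average": 2.6, "rounded": 2.5},
    {"student": 5, "average": 3.1, "rounded": 3.0},
]

# Transform move: derive rounded minus average.
# Negative means rounding hurt the student, positive means it helped.
for r in rows:
    r["delta"] = round(r["rounded"] - r["average"], 2)

def diverging_color(delta, limit=0.25):
    """Red for negative, white at zero, blue for positive (saturates at +/- limit)."""
    t = max(-1.0, min(1.0, delta / limit))
    fade = round(255 * (1 - abs(t)))
    if t < 0:
        return "#ff{0:02x}{0:02x}".format(fade)
    return "#{0:02x}{0:02x}ff".format(fade)

# Interaction/layout move: sort so the students most hurt by rounding come first.
most_hurt = sorted(rows, key=lambda r: r["delta"])

for r in most_hurt:
    print(r["student"], r["delta"], diverging_color(r["delta"]))
```

The default `limit` of 0.25 is the largest possible rounding error under the half-point assumption, so the darkest red and blue correspond to the worst and best cases.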

Lessons

Hopefully, you now have a sense of what the 4 design moves are. And you’ve seen how we can change our choices to make a visualization be more effective for a task. Along the way, you may have gotten a sense of how understanding task helps. And how effective designs can be simple – and done with existing tools.

Note that the details of the implementation didn’t matter. I could have done this with a different spreadsheet program, or if I had more time, I could have written a Javascript program. In fact the details are really irrelevant: there’s no use telling you which buttons in Excel to push, since this was done with an old version of Excel (the features are still there – it’s just different to get at them).

On-line Discussion Assignments

by Mike Gleicher on August 12, 2017

Each week there will be an On-Line Discussion Assignment.

Due: There will be an online discussion due each week of the class (15 in total). The initial posting is due on Tuesdays. Additional postings (including required discussion) are due on/by Friday. The assignments will remain open until the following Friday to allow for additional discussions and late handins. The first week there is a little extra leniency in the due dates; the last week there is a little less.

Late Policy: Late assignments will be accepted, subject to the class Late Policy. Turning something in late is better than turning in nothing (but you can’t turn things in after the assignment closes). At the end of the semester, we will look at your consistency: if you were a few hours late once or twice, it won’t matter. If you were consistently late, you may be penalized.

Assessment: Assignments will be scored on the 3/2/1/0 scale. A good assignment will include meaningful initial postings that answer the prompted questions and the required amount of discussion. If you don’t include all the “requirements” (in terms of the number of postings and topics) or your responses don’t fit the question (e.g., your initial posting) your assignment may be marked incomplete.

We will make subjective assessments of your postings (including the discussion) which will be used to decide between border grades.

Turning it in: There will be a “discussion” on Canvas that you can post to for each assignment.

Groups: Because it’s too hard to have a conversation with the whole class, the class will be broken up into random groups for each assignment. Once we get to steady state for the semester in terms of enrollment, we’ll probably hold the groups constant so people can get to know each other.

Learning Objectives (why are you doing this?): The primary goal of these assignments is to get you to think about the material in the readings and lecture, by forcing you to answer questions and have a conversation about it with your classmates. A secondary goal is to provide a check on whether people are understanding the readings (this is something of a “self-evaluation” – if you can’t answer the question, you probably haven’t read carefully enough). Peer assessment here is informal, but in practice it seems to work. You may also want to look at the learning goals for the particular week.

What you need to do:

Start by doing the readings (unless an assignment says to postpone readings)

With each assignment, there will be 1-2 “starting postings” – these are postings that you make that start new threads, and are independent of what others say. Your first one you’ll have to post before you are allowed to look at other people’s postings.

With each assignment you are required to discuss the answers (after you’ve made your original postings) with others in your group. While it’s artificial to quantify a minimum, there are minimum numbers of discussion postings (usually, you will need to make a minimum of 3 discussion postings per assignment).

Very terse postings may not count fully – saying “good answer, I agree” to someone is a good thing to do, but the expectation is that at least some of your discussion will say more than that.

Admittedly, the “course staff” doesn’t have the ability to carefully read and evaluate all discussion postings. We hope that people will take it seriously enough (and peer pressure will help) that this will be a valuable exercise.  As a graduate student (this is a graduate class), you shouldn’t just be doing things for the grade. You will learn more if you take this seriously.

Seek and Find Assignment Rules

by Mike Gleicher on August 12, 2017

Each week, there will be a seek and find assignment.

Due: There will be a seek and find each week of the class (15 in total). They are due on Fridays. The assignments will remain open for one week after the due date to allow for late assignments and for discussion. The first week there is a little extra leniency in the due dates; the last week there is a little less.

Assessment: Assignments will be scored on the 3/2/1/0 scale. A good assignment will include a valid picture and link, and a brief answer to the question. If you don’t include all the “requirements” (a picture, a link, and answer) or your responses don’t fit the question (e.g., the visualization you picked doesn’t meet the requirements) your assignment may be marked incomplete.

While discussion of seek and finds is not required, it is recommended. At the end of the semester, when we try to give grades, we will look at the quantity and quality of discussion contributions.

Turning it in: There will be a “discussion” on Canvas that you can post to for each assignment. Make sure to include your picture and a link, and your answer to the question. Looking at other people’s postings (and discussing them) is recommended, but not required.

Late Policy: Late assignments will be accepted, subject to the class Late Policy. Turning something in late is better than turning in nothing (but you can’t turn things in after the assignment closes). At the end of the semester, we will look at your consistency: if you were a few hours late once or twice, it won’t matter. If you were consistently late, you may be penalized.

Learning Objectives (why are you doing this): We want you to see how the concepts we discuss in class appear in the “real world.” We want you to see real examples and see how visualization, and visualization concepts are done. We want you to have examples to discuss and critique.

What you need to do:

For this assignment you must bring us a …. (data) visualization!

(sorry, this is a reference to an old Monty Python movie – if you don’t know the reference, that line won’t be funny. Even if you do know the reference, it might not be funny).

Each week, we will ask you to bring us a visualization (we will have these seek and find assignments every week). There will usually be some specification of what you need to find. We might ask for a certain kind of data, or an example of the use of a specific technique.

The seek and find ground rules.

  • It cannot be a visualization that you (or someone in class) made.
  • It must be publicly available.
  • You must be able to provide an image.
  • If it’s on a web page, you should copy a picture (either use a screen shot or copy the image). Please shrink the image to a reasonable size; if it’s too small for people to see the detail, they’ll be able to get it from the link you give. Post your image on Canvas following these instructions.
  • Try to find something interesting (to you at least)
  • There may be other rules added

Create a posting and include a picture of the visualization. If you found the visualization on the web, provide a link to the page that it is on (if it’s hard to find on that page, give some clues like “on page 4 of https://graphics.cs.wisc.edu/Papers/2015/AG15/Submission-FINAL-7-27-2015.pdf”). If you scanned it or photographed it, describe where you got it from (e.g., “scanned from p7 of January 6th Capital Times”).

Try to pick something that you don’t think anyone else will pick. Even though you can peek and see what others are posting, someone might post at the same time as you, so try to avoid redundancy (although this isn’t a strict rule).

You are welcome to discuss other people’s submissions (you are allowed to comment on Canvas). Discussion is not required. However, students often find it interesting to look at what everyone else has turned in, and to discuss by replying to the postings.

Because a discussion with 50+ people can become unwieldy, we divide the class in half.

What is this class and why?

by Mike Gleicher on August 11, 2017

What is this class and why? (or, “will you like this class?”)

I want to be upfront about what this class is. I want to get your expectations in the right place. If you’re expecting something else, you may be disappointed. And I think it helps you appreciate why I am teaching the way that I am teaching.

From the announcement:

This class is more about what pictures to make to understand data than how to make them. We will spend a lot of time understanding design principles. We will not spend lots of time talking about how to program visualizations, or how to use tools to make visualizations.

Or, to explain it another way:

Making visualizations isn’t hard. Making good visualizations is hard. So we need to understand what makes for a good visualization before we waste our time trying to make visualizations. This class is about understanding good visualizations.

Why focus on “Visualization Principles” not implementation?

The principles of good visualization apply for everyone. For each person, the appropriate tools and development process may be different.

The principles of good visualization are constant and unchanging (although, our understanding of them is improving). The tools for creating visualizations change continually.

The skills for thinking about visualization principles (design, critique, task-oriented analysis, abstraction, …) are generally useful for many things. The skills for creating visualizations are pretty specific.

Over the years, I think I’ve learned to teach people the principles. For many of the implementation skills (e.g., web development), many of you either already know a lot more than me (i.e., you are up-to-date web developers) or aren’t at a place on the learning curve where a class like this will help (i.e., you need to learn basic web programming skills first).

Why focus on Basics/Foundations and not Fancy Stuff?

First, a lot of the best cutting edge research is building a better understanding of the foundations.

Second, really understanding the foundations is the basis for doing fancy stuff.

Third, and maybe most important, effectively using the basic stuff and foundations is usually what you need. Fancy stuff should be a last resort!

Visualization’s goal is to solve people’s problems. Sometimes, that requires inventing a novel and complicated visualization. Other times, it might mean applying some simple, off-the-shelf solution.

Here’s my favorite analogy. You go to the doctor’s office because you feel sick. The last thing you want to hear is “that’s a novel and interesting problem! we need to devise a novel treatment. let’s write a grant proposal and hire some research assistants…” No, you want to hear “I’ve seen that before. No problem. Take two aspirin and see me in the morning.”

As visualization practitioners, our goal is to be able to look at a problem and make those kinds of prescriptions. The foundations (e.g., task and data abstractions) are key here. It’s how we can say “I’ve seen that before” and get to “take two scatterplots and see me in the morning.”

A standard design (like a scatterplot or line chart) can be really effective in many situations. And, if a standard design can be effective there are lots of good reasons to prefer a standard design. For example, they are familiar to the viewer and you probably don’t need to reinvent the implementation. The key is to be able to identify when a standard design is effective and how to use it appropriately. You’ll need to make similar choices in inventing a new, fancy design.

But, I really want to learn about (programming, using Tableau, …)!

This class is, admittedly, not for everyone.

This class will teach you how to design good visualizations, and will teach you a little about the choices you have in how to make them. If you learn how to use some tools for making visualizations (Tableau, D3, R, matplotlib, …) you will know what to make with those tools!

There are good resources for learning about specific tools. And in class, you may meet others who are also trying to learn these tools.

The focus on the non-technical elements (design, design methodologies, …) makes this different than the standard CS class.

But, I don’t care about general principles – I just want to visualize my data!

Sorry, we generally don’t let students “bring their own data” to class, for a number of reasons.

The principles you will learn will help you work with your own data – but for learning in class, it’s best that we work on data that everyone has access to, and that we believe is the right level of challenge.

If we’re not going to program, what are we going to do?

Because this year’s schedule isn’t together yet, I recommend that you look at the course web for the last offering (especially the schedule); it will give you a good sense of the topics we’ll cover. This year some things will change, but the basic structure and main content will be the same. You may also want to look at the “How to Visualize” post which summarizes the course content.

In terms of what we’ll do this semester…

  • We’ll do a bunch of reading and discussing to understand principles.
  • We’ll focus on learning critique skills to help us learn by analyzing visualizations.
  • We’ll spend a lot of time examining examples to understand why they work (and why they don’t). This will include spending time looking for examples to see how visualization is out there in the world.
  • We’ll spend time learning the building blocks that visualizations are built from.
  • We’ll spend time learning about visual perception and how it influences visualization design.
  • We’ll do some design exercises to try to appreciate how those principles get put into practice.
  • We’ll look at some specialized techniques and designs for dealing with particularly tricky kinds of data.
  • We’ll do some larger design challenges to practice applying the principles.
  • And if you want (it will be optional), you can do some programming for some of those design challenges.

But I want even more…

If you want more than what is in this post, you might go to the post How to Visualize that gives the basic ideas of visualization (and the class) in a little more depth.

If you want to go beyond what we do in class, (e.g., learn about some of the more mathematical and implementation aspects, get involved in research, …) we can find ways for you to do it. But that is in addition to class stuff.

Course Policies and Administration

by Mike Gleicher on August 10, 2017

Much of the information about Course Policies is on the Course Policies page.

That is slightly confusing since there is this “administrative” topic which includes more postings with details about things.

Administrative postings (below) expand upon what the main Course Policies page has to say.

New for Fall 2017

by Mike Gleicher on August 10, 2017

CS765 was offered in Spring of 2017. The basic structure of the class was good, but there were a number of things that left room for improvement. This Fall 2017 edition of CS765 will try to fix some of those things.

I am posting this list partially as notes to myself (to remind myself what I am doing), but also so you can see that I really do listen to students, and know what to ignore from previous course offerings.

The grading scheme was a mess

Figuring out how to grade this class is messy no matter what. This time, I am just admitting that I have no good way to evaluate online discussions, and am not even going to try to give meaningful scoring.

Worse, last time, I did gather quantitative statistics on online discussions – which upset people because they thought that was the evaluation criteria. I find the quantitative info interesting, but I didn’t want to imply that how much you wrote mattered more than what you wrote. (That said, my informal examination is that the two were well correlated.)

The web presence was a mess

Someone pointed out to me that a student had to use 4 different web systems (the web page, Canvas, Piazza, and Box for the reader).

I only found out at the end of the semester that the scheme we had for pushing notifications was only notifying me. So no one was getting notifications.

Part of the problem is that we need the different web systems – they are each good for different things. This semester, I will try to focus more on using Canvas, since while it isn’t the easiest to keep organized, it is a central place for many things.

We had an irregular schedule at the start

It took a while to get the class into a regular rhythm. After a while, I found the right rhythm. This year, we will use a similar scheme right from week 1.

Consistency in Assignments is important

Related to the irregular start, the regular assignments need to be regular (and they will).

One trick will be to properly mix in the “irregular” parts of class (the design challenges) so that there aren’t weeks of craziness mixed in with weeks of boredom.

I missed the first week

I won’t do that again. The first week is important, for many reasons. It sets the tone for class, is an important time to get to know people, etc. I will miss class for some conferences, but I won’t miss the first week.

We had a terrible room (for this class – it’s probably good for other things)

It seemed like a good idea – a “collaboration classroom” where students sit around tables and can work together. But what it meant was a room that put me in the center, with everyone’s back to me. It meant lectures were terrible, I didn’t get to know students, and traditional lecture interactions didn’t work well.

This year we are in a more traditional classroom. It limits the class size (for better or worse), but lectures will go better. In-class exercises might be tougher…

Students weren’t getting the basics in a concrete way

A lot of material was presented abstractly, and I went through the concrete manifestations very quickly. I will try to put more energy into making these foundations (like encodings) concrete, using in class exercises and other pedagogical designs.

Because I really cannot check the online discussions (and I don’t want to give quizzes), I have less of a sense of how well students are grasping the concepts. This semester, I will try to have more class exercises to help students self-assess.

Design comes into class too late

Part of the argument is that you shouldn’t make stuff until you know what to make. But I want to get more design in earlier, since making stuff is important to many people (and is fun).

The Importance of Critique wasn’t made clear

We did a lot of critique – but I didn’t explain why it is so critical to doing vis (and learning about vis). This year I will be clearer about why we focus so much on critique and try to make it a more regular activity.

The overall plan for class wasn’t emphasized

I have a strong opinion on how you should learn about visualization. This class implements that plan, and provides (what I believe is) a good foundation for doing all kinds of vis work as a user, a tool builder, or a researcher. There is a method to the madness – the steps really do follow.

However, I am not sure that I communicated my strategy for why we are doing things the way we are doing them. It was communicated briefly, in an early class. I will try to better emphasize my philosophy and come back to it. I’ll also try to communicate it in writing.

Books for the class

by Mike Gleicher on August 4, 2017

You are not required to purchase any textbooks for this class.

There will be extensive (for a CS class) readings in this class, however, all of them will be provided online.

There will be several books that we use in class. We will provide you with access to the parts you need online, but if you like to have physical books, you may want to purchase them. In particular, some of the books we will only give you parts of.

This is a list of the books we’ll look at parts of in class – so you can decide if you want to buy one or more of them, and to give some context for when I ask you to read a chapter out of the middle of a book. If you’d rather just be surprised when I point you at a chapter online, read no farther…

All of the required chapters, and a few optional extras, will be available online on Canvas in the Course files section. There’s also a page I made with the complete list.

In this list, I’ll provide links to the books on Amazon. This is not an endorsement of Amazon (see my 2015 rant about online bookstores). They are convenient, and they have pages with good information about the books.

Here are the “main” books of the class, with some commentary on the books so you can decide whether you want to invest in physical copies:

  • Tamara Munzner’s Visualization Analysis and Design. (2015 Discussion) (Amazon purchase)
    This is a “graduate level computer science” textbook that shares a similar philosophy to how I like to think about (and teach) visualization. It’s really good at giving you a way to think about visualization, and some examples of how the ideas are applied. It is of limited use as a reference book, and it doesn’t talk about practical issues at all.
    Over the course of the semester we will read almost all chapters of Munzner’s book.
    You can access the Munzner book online through the UW library here.
  • Colin Ware’s Visual Thinking for Design.  (2015 Discussion) (Amazon purchase)
    This is a thin little book (I know several people who read it in one sitting) that discusses the psychology of visual perception and its relationship to visualization and design. It’s not very deep, but it’s a great place to get started in appreciating how understanding how we see can help us be better designers.
    You can access the Ware book online through the UW library here.

Alberto Cairo’s two books on visualization are both great. If you wanted one book to “teach you about Vis” I’d probably recommend one of them. For various reasons, they aren’t perfect for class texts. Since not all of the books are relevant to the class, I don’t feel right requiring them. I will provide a few chapters of each over the course of the semester (academic fair use allows me to provide a small number of chapters of a book). If you’re interested in Vis, I suspect you will like Cairo’s books and want to buy one (and you’ll see what I’m talking about from the chapters you read).

In the Spring 2017 version of class, I gave a chapter-by-chapter breakdown of the two books. If you want to buy one book, I’d pick “The Truthful Art” (Amazon link). In fact, if you want to buy one book on visualization, this might be my current favorite.

Edward Tufte’s Books on Visualization are probably the most famous books on the subject. They are more art history books, full of historical examples and commentary, than books to help you understand or design visualizations. (see my 2015 discussion). If you are working in the field, you will probably want to own a set (and he’ll sell you a complete set of the 4 books for a reasonable amount of money). Read Cairo first. (Chapter 3 of “The Functional Art” has a great discussion of Tufte). We’ll look at a few chapters from Tufte over the course of the semester.

Books on Graphic Design: You will probably want to read something about graphic design. You could do a whole graduate degree (and more) on the subject, so no one book is going to do it justice. This is a topic you’ll probably want to read more of than we get to do in class.

I really like Robin Williams’ “The Non-Designer’s Design Book” (Amazon) as a quick dose – an hour lesson that helped me a lot (and helps many people a lot). A single hour lesson isn’t going to make anyone a designer, but it’s more useful than you can possibly imagine such a small dose being. We’ll actually do the small dose in class (including having you read some of Williams’ book).

I haven’t finished reading “Design for Hackers: Reverse Engineering Beauty” yet (Amazon), but in principle I like what it is trying to do (get the main points across to geeks like me). The book covers the right topics, but I haven’t decided if I like how it covers it yet. It is available online through the UW library, so you can look it over to see if you like it or not.

Other Books: There are a few other books that we’ll pull a chapter from. I cannot recommend that you buy any of them. None of them have enough useful information in them.

None of the books have very much “practical” stuff (like how to program visualizations, or how to use some tool). These kinds of topics change quickly, so books tend to be out of date by the time I find them. Also, the best resources for these kinds of things tend to be online.

Design Challenge 3: Build a Visualization!

by Mike Gleicher on August 2, 2017

This is a stub / placeholder. Details of this assignment are coming soon.