The Week in Vis: Week 9 (Oct 30-Nov 3)

by gleicherapi on October 28, 2017

Week 9 (Mon, Oct 30-Fri, Nov 3) – Color

While the intention for last week was to talk about human perception, we instead took class time to discuss experiments. And we spent a day with a design exercise to think about many aspects of abstract problems, tasks, designs, evaluations, etc.

This week is “color week.” We’ll also talk about other aspects of perception (that we didn’t get to last week). Depending on how things go, we may do an in-class design exercise on Wednesday, or we might continue to discuss color and save the ICE for the following week.

Hopefully, you are well into Design Challenge 2. We’ll try to get DC1 feedback to you this week.

Learning Goals (for this week)

  1. Understand color from the physics, perceptual, display, and semantics points of view.
  2. Understand how different aspects of color (perception, physics, semantics) limit what color can and can’t be used for in visualization.
  3. Appreciate the design of color ramps, especially the issues with rainbow ramps and the utility of “brewer style” ramps.
  4. Understand how to choose good color mappings and use color as an effective encoding.

The Week in Vis: Week 8 (Oct 23-Oct 27)

by gleicherapi on October 22, 2017

Week 8 (Mon, Oct 23-Fri, Oct 27) – Perception

Last week, we talked about evaluation (and Tufte), and looked at a design problem as a way to think about design choices, evaluation, and comparison. And you (supposedly) wrapped up DC1. Many of you came and talked about DC2.

This week, we’ll move on to the study of human perception, and how it can influence Vis. We’ll just be touching the tip of the iceberg – since perception is a huge topic, and it can have wide-ranging impact on how we do visualization and design.

Part of the interest in perception for vis is the empirical methodology – which can be applied more broadly. I am not sure how deep we’ll be able to get into the design of experiments.

Design Challenge 2 also starts – beware that the first phase has a hard deadline.

This week, there will be no class on Friday. I’ll have an extra office hour in the class time slot (11-noon) if you want to come by to talk about things Vis or class related.

Learning Goals (for this week)

  1. Have enough of the basics of how human perception works to appreciate how it can impact visualization design.
  2. Understand key phenomena, such as popout, grouping, and aggregation, and see how they can be applied in visualization design.
  3. Have awareness of key perceptual and cognitive limitations, and how these can impact the success of visualization design.
  4. Appreciate how perceptual science methodologies can provide guidelines for visualization design.
  5. (probably not time to get to it) Appreciate how the empirical approaches of perception research can be adapted to visualization.

DC2 is posted…

by Mike Gleicher on October 19, 2017

Design challenge 2 has been posted (it is here) – so you can look it over and think about getting started once DC1 is over.

The assignment still says “draft” since I am sure there are some issues to work out – if there is something that isn’t clear, let me know and I’ll try to clarify the assignment. If you can’t figure it out, probably no one else can either.

This assignment is quite different than any that I’ve given in the past, but I think it really gets at a lot of the topics we’ve been discussing.

Design Challenge 2: Spaghetti Plots

by Mike Gleicher on October 17, 2017

“Spaghetti plot” is a pejorative term for a multi-line line graph that has too many lines for people to see everything in it. Despite their problems, such plots are ubiquitous. In this design challenge we will try to understand them, figure out what they are good and bad for, and (hopefully) come up with something better for the places where they do not work.

Changes in BLUE

11/2/2017 – Additional Info: A list of abstract tasks compiled from people’s phase 1 lists is available on Canvas. If you want more ideas than the ones you generated, you can look at it.

11/1/2017 – Clarification: for phase 2, you can use the abstract tasks you turned in in phase 1. If you want to use tasks defined by others, you can pick those as well. We will try to get a list of tasks compiled from student phase 1s out quickly so you can get more ideas beyond what you did in phase 1.

11/1/2017 – Schedule Change: we are extending the final deadline from November 19th to November 26th, although (1) the deadline will be hard (we won’t accept assignments more than a few days late), (2) you still need to turn in a “draft” for November 19th so we can get a sense of what you are turning in. I didn’t want to make an assignment due over Thanksgiving, but think of it as due before Thanksgiving, with a grace period. More details in blue below.

 

In a nutshell…

  • October 29 – Situations and Tasks – You will turn in a list of tasks and situations where this kind of data applies.
  • November 5 – Analyses – You will turn in a more complete analysis, focusing on the existing designs. You’ll provide a more complete list of tasks along with examples of data. You should identify the problems that you intend to address in the later phases.
  • November 12 – Designs – You will turn in designs for solutions, as well as describing what else you will do (make a tool, provide data sets, provide experimental designs, …)
  • November 19 – Drafts – You will turn in a draft of your assignment so that we can get a sense of what to expect. This deadline is fairly firm.
  • November 26 (extended from November 19) – Final Submissions – You will turn in your final product. This deadline is fairly firm.

A warning: don’t view this as four 1-week assignments. There’s a lot to do at the end, so you probably want to be working towards it over the course of the weeks.

Background

This assignment is about multi-line line graph data. A multi-line line graph is just one visualization of the particular kind of data that we’re interested in – but it’s easier to say that than “data sets with a nominal/categorical set of items, each measured over a quantitative (interval/ratio) dimension, where at each sample there is a quantitative (interval/ratio) value.”

To make our lives a little easier…

  1. There is a set (potentially a nominal/categorical set – but it might be ordered) of “objects” – these are the things that we have a line for. (we have N objects)
  2. Each line covers a range (for purposes of discussion, let’s call it “time” – even though the dimension can be anything). For this assignment, we can assume that each line has the same start and end “time”. Even though this is a continuous dimension, we’ll assume we have a uniformly spaced set of samples (so we can simply refer to them by integers), without any gaps. (we have M samples)
  3. At any “time” (for any one of the samples on the line), the line has a “value” – which is a quantitative value within some range.
  4. We’ll ignore the fact that the “time” dimension is time (since it isn’t always). This means that there aren’t “obvious” cycles that we need to account for (like seasons, or day/night).

For example…

  • We may have climate data. For 50 cities around the world (N=50), over the course of 3 years, we have measured the temperature each day (M=365*3). For each of those 50*365*3 observations we have a temperature in degrees Celsius (in the range -20 to 50).
  • We may have sales data. For 100 products (N=100), over the course of 10 years, each month (M=10*12) we have the number of items sold.
  • We may have noise data on a train. For each of the 12 cars on a train (N=12), over the course of the 100km route (measured every km) (M=100), we have a measure of the noise level.
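
For the programmers, here is one way this data might sit in code, using the first example above. This is a minimal sketch; the array layout and variable names are my own and not part of the assignment:

```python
import numpy as np

# N objects ("lines"), M uniformly spaced samples per line (no gaps).
N, M = 50, 365 * 3

# One row per object, one column per sample; values[i, j] is the value
# of line i at integer "time" j. Here: fake temperatures in [-20, 50].
values = np.random.uniform(-20.0, 50.0, size=(N, M))

labels = [f"city_{i}" for i in range(N)]  # the nominal/categorical set of objects
times = np.arange(M)                      # integer sample indices
```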

Notice that there are 3 types of scale we need to contend with:

  1. The number of “lines” (N)
  2. The number of samples of each line (M)
  3. The dynamic range of the values (if the values are over a very wide range they can be harder to show)

If all three of those are small (N, M, range), then it’s easy – you can use a multi-line graph. But as N, M, and/or the range grow, the problem gets trickier. And that’s where the assignment comes in.

There are three obvious designs:

  1. Spaghetti Plot: showing multiple lines in one graph – usually differentiating the lines with different colors. (it’s called a spaghetti plot because with many lines it becomes a tangled mess). It’s a position encoding for each of the quantitative dimensions, and color (or some property of the line) for the categorical one.
  2. Small Multiples: showing individual line graphs – probably making each one small, which is why it’s called “small multiples”. With many lines, it becomes big (or has very tiny lines). This uses a position encoding for all 3 dimensions (we use position to encode which line).
  3. Encoding the value with color, rather than height, so that each line is a strip of color. A stack of these strips of color has been called a “Lasagna Plot” (see resources) – I’m not sure if this term has caught on, but I like it and will use it. This uses position for the “time” dimension and the item, but uses color for the value.

By “visual design” I am referring to encoding. There may be other encodings (can you think of some? that’s part of the challenge here).

Here are those three designs – generated by the simplest program that I could write, all using the same “fake” data. You can check out my simple implementation (described below). You can also try out Florian’s D3 example.
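
If you want a sense of how little code the simplest versions take, here is a rough matplotlib sketch of all three designs on the same fake data. This is my own minimal approximation, not the reference implementation described below:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
N, M = 8, 100
values = np.cumsum(rng.standard_normal((N, M)), axis=1)  # fake random-walk lines

# 1. Spaghetti plot: all lines in one graph; color says which line is which.
plt.figure()
for i in range(N):
    plt.plot(values[i], label=f"line {i}")
plt.legend(fontsize="small")
plt.title("Spaghetti plot")

# 2. Small multiples: one small graph per line; vertical position says which line.
fig, axes = plt.subplots(N, 1, sharex=True, sharey=True, figsize=(5, 8))
for i, ax in enumerate(axes):
    ax.plot(values[i])
fig.suptitle("Small multiples")

# 3. Lasagna plot: each line becomes a strip; color encodes the value.
plt.figure()
plt.imshow(values, aspect="auto", cmap="viridis")
plt.colorbar(label="value")
plt.title("Lasagna plot")

plt.show()
```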

For each encoding, there are lots of minor variants. You can add interaction to highlight one element in a spaghetti plot; you can scroll and filter to select within small multiples; you can change the coloring of a lasagna plot, …
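
As one illustration of the interaction idea, a spaghetti plot where clicking a line highlights it takes only a few lines of matplotlib. This is a sketch, assuming the `values` array from the snippet above:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# picker=5 makes each line clickable within a 5-point radius.
lines = [ax.plot(v, color="lightgray", picker=5)[0] for v in values]

def on_pick(event):
    # De-emphasize everything, then highlight the clicked line.
    for ln in lines:
        ln.set_color("lightgray")
        ln.set_zorder(1)
    event.artist.set_color("crimson")
    event.artist.set_zorder(2)
    fig.canvas.draw_idle()

fig.canvas.mpl_connect("pick_event", on_pick)
plt.show()
```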

Given how common this kind of data is, you’d expect there to be good solutions. Or at least a well-characterized space of design decisions that gives guidance on how to make informed choices. But I am not aware of any in the literature. So, we have to work it out ourselves in this assignment.

The structure of the assignment (or why is this so complicated?)

There is a broad range of skills and interests in the class. I want to keep my promise of “Programming Optional” – and even among people who want to program, there is wide variance. I don’t want this to just measure how much programming experience you had before this class.

The main objective of the assignment is for us to use a very common and standard chart type to explore visualization principles and design. It’s a real problem that lots of people have, and I don’t think there is consensus in the literature on how to address these challenges. In fact, I don’t think there is much literature on the problem. People just do the basic stuff.

However, I want to give people who want to learn about implementation a chance to try it out. Or, if you are a good programmer, you might think it’s easier to explore a design idea with code. Or if you’re interested in experiment design, you might want to do some of that. Or…

So, I am giving you a lot of choices. I may regret it. And I make no promise that the hardness / amount of effort will be balanced. Your assignment must excel in at least one area – and you can choose which one it is. However, you must do at least some of each part of the assignment.

  • Task / Design Analysis – You must think through what people do with this kind of data, and what they want from their visualization tools. Then you will look at the available visualizations and see how well they work for the different designs (both the already known ones, and new ones you will create).
  • Novel Designs – You need to come up with solutions to some specific use cases (or general cases) with “new designs.” This might be some radically new visual encoding, or maybe just some tweaks on an existing design (like coming up with a clever use of interaction).
  • Implementation – You will need to show off your designs. This might be building an actual system that lets people try it out. It might be a set of hand-drawn sketches. There are lots of things in between. Some notable categories (these aren’t hard distinctions, and the names are somewhat arbitrary):
    • Tools – are programs that can read in new data sets. They will let a user (or a grader) try out your designs with their own data.
    • Prototypes – are visualizations that are created with real data (like a tool), but only for a fixed set of data. For example, if you write a program that only works with a few data sets, or manually draw a picture from a data set. If the grader can’t try their own data set, then it’s not a tool – but a prototype.
    • Sketches – don’t use real data. They approximate data, or use fake data. Usually you do this if you’re drawing something by hand, but you can imagine a program that draws a picture of what a visualization would look like without actually computing it from data.
  • Evaluation – You need to show that you know how to evaluate designs. This includes critique (which overlaps with analysis above). But it also means thinking about how you might design an experiment to better understand how well a design works. Unfortunately, you probably won’t be able to run the experiment – but we want you to at least think about what the experiment should be.

The parts of the assignment

Note: the different parts of the assignment connect. So when you’re doing an earlier part, think ahead that you will be using these results as the basis for the latter parts.

For parts 1 and 2, each person must work alone and submit their own work. For parts 3 and 4, you may work with a partner (rules below).

Part 1: Task Analysis

You need to come up with lists of:

  • 3-5 concrete situations where this kind of data comes up. I gave 3 above (city/temp, sales, train noise) – don’t pick those. Describe how these problems might scale (will they get large in N, M, or range?). If you can identify real, publicly available data sets, that’s great (but not required).
  • 7-10 tasks. Describe them both in terms of a specific situation but also in a more abstract way (examples below).

I’ll give you a few to start with… you cannot use these in your list for part 1 (you can use them in your list for part 2). I’ll describe them in terms of the cities/days example above (these are the “concrete” examples).

  • On which day was the greatest range in temperatures seen?
  • What city had the widest range of temperatures?
  • Was there a month in which a city had its temperature rise consistently?

For abstract descriptions, these might be:

  • Identify the sample (time) with the greatest range in value.
  • Identify the line with the greatest range in value.
  • Identify a consistent increasing trend for a line within a time range.

Note: there is a hard cutoff for this part. After the cutoff date, we will share lists – so you can take ideas from others for your parts 2-4.

You will turn in this assignment as 2 lists as a Canvas assignment.

Note: keep the future phases in mind as you do this.

Part 2: Critique

In this part, you need to pick 3-5 different “tasks” and critique each of the three basic designs [Spaghetti, SmallMultiples, Lasagna]. For each task, explain why you think each design may or may not be appropriate (consider how they might scale as N, M, or range scales). You may wish to sketch out what something might look like to better explain the pros and cons.

Remember that a critique isn’t just to say what’s wrong – it’s also to say what’s right. Hopefully, you can identify some things the basic designs are good for as well as some things they are not. For the later parts, you will need some situations where the basic designs aren’t good – situations where you will need to come up with something better.

You will turn these in via Canvas.

Note: even though phases 3 and 4 are in the future, you probably want to start on them as part of your work for phase 2. Also, as part of part 2 you should say if you plan to work with a partner.

Part 3: Design

Pick 1-3 scenarios where none of the existing designs seems to work well (by scenario, I mean a combination of task and data scale / data properties). Come up with some alternative designs that might plausibly provide a better solution than the 3 basic designs. (Note: expectations for pairs are higher.)

You need to come up with at least 2 designs. Depending on what you do with the designs (in part 4), you may need to come up with more. If you’re implementing/testing designs, then you can do fewer; if your assignment is more of a “design study”, you might want to generate a larger number of alternatives. (Note: expectations for pairs are higher.)

It is harder to assess what counts as a “different” design or a “different” scenario. It’s less a clear-cut “these are different” and more a continuum – from minor tweak to completely different. Try to come up with things that are genuinely different. Remember, you have the “4 design moves” to try (transform data, change encoding, change layout, add interaction).

You need to give a rationale for each design: why you think it might be a better alternative for the scenario than the basic designs. At this point, you’re just checking that your design is plausible – it could be that when you implement/evaluate it, you find out that it’s not actually a good idea. But at this point, you should be able to at least give an argument for why you think your design is good enough to be worth exploring.

For this phase, what you must turn in are descriptions/sketches of your designs (which includes your rationale for them), and a description of what you plan to do with them (i.e., for phase 4).

In a sense, this is a “rough draft” of your final handin (phase 4). Your “sketches” might be screenshots of your implementation (which hopefully you’ve started, if you are doing one).

Part 4: Implement and Evaluate

For the designs you came up with, you need to follow through to help determine if the proposed design actually addresses the scenario that you identified. There’s a wide range of what you might do here – and it’s hard to compare. You might do a lot of very detailed sketching and critique, or you might do a lot of tool-building and less analysis, or you might do some simple implementation and put lots of thought into evaluation, or …

But, each assignment must include some implementation and evaluation – for broad definitions of what those mean. Expectations clearly vary based on what you do: tool building is time consuming, so you may do less of other things (including having complex designs) – if you’re “just” sketching designs, we would expect you to excel in other ways. This makes evaluating the assignments really difficult – and planning ahead for it hard.

I’ve seen broad ranges of assignments (for similar assignments in the past): “programmed implementations” range from the minimal amount of code to get a basic graph up using a standard Python or R library to complete systems that someone would actually want to use. I’ve seen some amazing “sketches” – like one where a student showed the interactions in their design as a storyboard made as a series of photographs of a white board with post-it notes moving around.

We will provide some basic implementations of the basic designs. You can use these as a starting point for your own implementations (be sure to give proper attribution!). You can use these to make pictures that you draw on to sketch fancier designs. You can use these to see how badly the basic designs do on the problem you want to solve (assuming you create some test data). Or you can ignore them.

If you choose to program… we don’t care what tools you use. You must be able to give us a demo – so make sure it runs on your laptop. We may or may not be able to run things (even if you use tools that we have, there may be version issues or setup issues or …). That said, you must give us everything we would need if we wanted to try to run your program – including a list of things we need (e.g. libraries and other setup) and instructions. You also need to give us pictures of what your program looks like when it runs.

If you choose not to program, we’re still curious how you made stuff. Usually we can figure it out. But if you drew pictures with a drawing program, or generated things using some tool like Tableau and embellished them, or … let us know.

We expect that you will implement and evaluate at least 2 designs (more for partners). Of course, if you have a really fancy design that handles a lot of scenarios, you might instead do a lot of evaluation of how that one design addresses all those scenarios.

Implementation – you must do something that lets us clearly see what the designs look like, and get a sense of how they would address the problems that you set out to address. As described above, this might be anything from building a nice system that lets us load in our own test data sets, to some hand-drawn sketches that give the ideas. It will be important for you to be clear about what your ambitions are.

Evaluation – you must do something to assess whether your design is good, and/or describe a plan to do so. Note: this is about evaluating the design, not the implementation. There are a number of things you might do (and this may not be a complete list):

  1. Critique – explain why your design is good (or not). Analyze how good it is for the specific scenario it’s designed for – and compare it to other designs. Consider how it scales to harder or bigger data. Consider how it applies to other tasks. Consider how it compares to the basic designs.
  2. Examples / Use Cases – give examples of your design working on test data that typifies the kinds of tasks you are trying to solve. You may want to make test data where you know the answer so that you can show that your design is effective (and try out other designs).
  3. Test Data – provide examples of testing data where the task is appropriate, there is a right answer, and there are varying levels of difficulty. This is a place where programming may be useful (writing a sample data generator – see the sketch after this list). Of course, finding real examples can be more interesting. These are useful for #2 and #4.
  4. Experiment – provide a design for an experiment to assess your design and compare it to a baseline. What model tasks? What measurements? What data? How will you control for the various factors that come up?
    Note: you probably cannot run the experiment (since you don’t have an IRB protocol), but you can give a design for an experiment at varying levels of detail.
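
To make the test-data idea concrete, here is a minimal sketch of a generator for the “consistent increasing trend” task: one line gets a planted trend over a known range, so the right answer is known by construction. The function name and parameters are my own, not part of the assignment:

```python
import numpy as np

def make_trend_data(n_lines=20, n_samples=200, trend_line=3, strength=15.0, seed=0):
    """Random-walk lines, with a known increasing trend planted in one line."""
    rng = np.random.default_rng(seed)
    values = np.cumsum(rng.standard_normal((n_lines, n_samples)), axis=1)
    start, stop = n_samples // 4, n_samples // 2  # the known "answer" range
    values[trend_line, start:stop] += np.linspace(0.0, strength, stop - start)
    return values, trend_line, (start, stop)

# Lower `strength` for a harder version of the same task.
values, answer, answer_range = make_trend_data(strength=5.0)
```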

For Part 4, you’ll need to turn in “everything” – some documentation describing what you did, your designs, your evaluation. Any pictures and implementations. Even if you turn in code, you need to turn in enough pictures of what things look like that we can assess it without having to run it. Part 4 is a superset of Part 3, so your handin needs to include a description of your designs and your rationale.

If you turn things in on time by the draft deadline, we will check over what you’ve turned in and try to give you some initial feedback, along with the opportunity to make small fixes (without being penalized for being late). If you do not turn things in by the deadline, you will miss out on this opportunity. The final deadline will also be fairly firm.

Requirements / What you’ll turn in / Assessment

There is a deadline for each phase. You will receive a score for each of the initial phases, and an overall grade (there is no separate phase 4 grade – the scores for the initial phases will be factored in to create an overall A-F grade for the whole thing).

Phase 1: Due October 29th (hard cutoff October 31st) – you will turn in your lists of tasks and concrete situations. Your task list should describe each task concretely and abstractly. Upload your lists to the Canvas Assignment. (graded Acceptable/Unacceptable/Not Turned In). Note: Phase 1 must be done individually (even if you do Phase 3 with a partner).

Phase 2: Due November 5 (hard cutoff November 10th) – you will turn in your analysis of tasks and how the basic designs apply. While this will be graded Acceptable/Unacceptable/Not Turned In, we will also note truly exceptional assignments and reward them in the final score. Upload a document to the Canvas Assignment. Note: Phase 2 must be done individually (even if you do Phase 3 with a partner).

Phase 3: Due November 12 (hard cutoff November 17th) – you will turn in your designs and their rationales. Upload a document to the Canvas Assignment. This will be graded Acceptable/Unacceptable/Not Turned In. Really great designs will be rewarded since we’ll see them again in Phase 4. You may do Phase 3 (and 4) with a partner (see rules below).

Phase 4: Due November 19 (draft) and November 26 (final; see the schedule change above). Turn in everything. If you turn things in on time, we will try to give an initial check and offer you some opportunity to fix things.

November 19th: turn in a draft of your assignment. This is so we can get a sense of what you will have for your final. If you want this to be your final, that’s OK. We will try to give people some feedback (no promises).

November 26th: turn in the final project / everything. The deadline is fairly tight as we need to start grading very shortly after it. We will accept late assignments until we start to grade (which will probably be a week or so later). “Everything” should include a document that explains what all the pieces are. It’s probably easiest to turn things in as a big zip file. It should be clear what you’ve done, what your designs are, what tasks you’re trying to address, …

(if you prefer to think of it this way: … The deadline is still the 19th, we’re just giving a clearer and more lenient late policy, with the ability to “resubmit” if you want to update your assignment).

We may schedule in-person demos to let people who built things show-off their implementations, or even for people who just want to explain their submissions in more detail. These will happen after Thanksgiving. They may happen at a Friday class time.

If you don’t think Canvas is appropriate for turning in your assignment for phase 4, please contact the instructor to make an alternative plan. But for most people, uploading to the Canvas Assignment should work.

We won’t give a specific grade for Phase 4 – you will get an overall grade for the assignment that factors in what you did in the initial phases. Note that for each of the phases that you miss, there is a mandatory penalty – you cannot get an A for the assignment if you do not turn in all the parts – including the draft.

If you work with a partner, both of you will get the same grade for Phases 3 and 4. Your overall grade may be different based on Phases 1 and 2. If you work with a partner, only one person needs to turn in phases 3 and 4. However, the partner who isn’t turning in the assignment must still put something into Canvas (in the text submission) saying who their partner is, and that they are turning things in.

Some thoughts on grading…

Grading will be subjective and arbitrary 🙂

This assignment gives students a lot of freedom to choose how to excel. Excellence will be rewarded with a good grade. And there are many ways to excel. We also appreciate that there are tradeoffs – if you put a ton of implementation effort into making a robust tool that anyone can use with their own data, we understand that your designs might not be as far out, or you may have fewer designs. If you come up with a wide range of really creative designs for different tasks with really detailed sketches and rationales, maybe you won’t have as detailed an evaluation plan – or any programming at all.

However: all assignments must do some of all parts. You must come up with tasks/situations, create designs that address them, communicate them (either as a program or pictures), give rationales for them, and have some sense of how to evaluate them.

Ground Rules and Starting Points

We will provide sample data as CSV files. In these files, the lines will be in columns, and each “time step” will be in a row. The first column is an X value. You can ignore it (they will always be consecutive), but you may want to use it to properly label the axes.
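
For example, getting one of those files into an N×M array might look like the following sketch. The file name is a placeholder, and it assumes the files have a header row – adjust if they don’t:

```python
import pandas as pd

df = pd.read_csv("sample.csv")          # placeholder file name
x = df.iloc[:, 0].to_numpy()            # first column: the X ("time") values
values = df.iloc[:, 1:].to_numpy().T    # remaining columns: one line each -> N x M
labels = list(df.columns[1:])           # line names from the header row
```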

We will provide example solutions for you to try. They will be simple implementations of the basic designs. You may use them as starting points for your implementations, or simply to get a sense of what the basic designs look like on different data sets.

Be sure to give proper attribution for any pieces that you “borrow” to start off with. It is OK to take our initial implementations – or the D3 examples – as starting points, but you must identify this and give proper attribution.

I made the most brain-dead simple implementation of the basic designs in Python using matplotlib. It looks terrible, and only does the simplest thing. But it can give you a starting point. You can get it on GitHub.

I have made some sample data for you to start with – it’s random noise (some of it is more structured so there’s something to look at). The code to generate it is in my python repo, but you can get pre-computed sample files on GitHub.

Florian wrote a basic implementation using D3 (GitHub repo, running online version). This also serves as a starting point. But you can also look for the D3 multi-line line graph example.

Working with a Partner

We will allow you to work with a partner for this assignment. But only for phases 3 and 4. Some ground rules:

  1. You must send email to the instructor on or before November 5th (the day Phase 2 is due). You should mention your partner in your Phase 2 submission (even though your Phase 2 submissions should be independent). We recommend thinking ahead about Phase 3 while you’re working on Phase 2 (it’s OK to talk about Phase 2 with others, but we expect everyone’s work to be somewhat independent).
  2. Both partners need to agree to work with each other. (I may not check)
  3. Once you’ve decided to work together, you are committed to working together. If you don’t like working together, you can each turn in separate assignments – but we’ll only grade one.
  4. Both partners will get the same scores for Phase 3 and Phase 4 – we won’t try to attribute the work to one person or the other.
  5. You only have to turn in 1 “writeup”. But in every other way, the expectations are higher. We expect approximately twice as many scenarios to be considered and designs to be generated. We expect twice as many designs to be considered in Phase 4. But really, it’s twice as much stuff – you might have fewer designs done well (or a bunch of very similar designs). But hopefully, with two people thinking about it you can come up with more wildly different designs.

Some Other Resources

  1. The paper on climate model comparison gives a great example of coming up with a clever design for a spaghetti plot by really thinking through what their users need.
    Dasgupta, A., Poco, J., Wei, Y., Cook, R., Bertini, E., & Silva, C. T. (2015). Bridging Theory with Practice: An Exploratory Study of Visualization Use and Design for Climate Model Comparison. IEEE Transactions on Visualization and Computer Graphics, 21(9), 996–1014. http://doi.org/10.1109/TVCG.2015.2413774
  2. My paper on comparison may give you ideas on how to think about the problem in a structured way – for defining tasks and considering design alternatives.
  3. The original paper on Lasagna Plots (where the name comes from):

    Swihart, B. J., Caffo, B., James, B. D., Strand, M., Schwartz, B. S., & Punjabi, N. M. (2010). Lasagna plots: a saucy alternative to spaghetti plots. Epidemiology (Cambridge, Mass.), 21(5), 621–5. http://doi.org/10.1097/EDE.0b013e3181e5b06a

  4. Our paper on tasks for different designs for line graph data (trying to get at what colors are good for and not)

    Albers, D., Correll, M., & Gleicher, M. (2014). Task-Driven Evaluation of Aggregation in Time Series Visualization. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems CHI Conference, 2014, 551–560. http://doi.org/10.1145/2556288.2557200

 

DC1 – Rough Drafts Feedback

by Mike Gleicher on October 17, 2017

Everyone who turned in something for the rough drafts got full score (4/4). In some cases, I noted some feedback that will hopefully help you improve what you do for the final thing. Here are the 4 comments that got made (these may apply to you even if I didn’t leave this comment):

  1. Try to make visualizations that are multi-variate. A single bar chart, pie chart, or scatterplot can tell a story, but isn’t really very multi-variate. Even a choropleth map has limited multi-variateness.
  2. Think about the clarity of the individual designs.
  3. If you’re making a comparison, think about how to choose a design that will make that comparison easiest for the viewer.
  4. It’s not clear why the data fits the design.

Because I didn’t have the writeups, because I didn’t have much time to explore each one, and because they aren’t finished, it’s not always clear where these different comments apply – so if you think I gave feedback that was wrong, it might be. Treat the feedback here as a suggestion.

But for everyone… we’ll be looking for those 4 things when we do the grading on the final things. As the assignment says, we are looking for things that are multi-variate, have good design choices, and are interesting.

The Week in Vis: Week 7 (Oct 16-Oct 20)

by gleicherapi on October 13, 2017

Week 7 (Mon, Oct 16-Fri, Oct 20) – Evaluation

Last week, we talked about implementation. And hopefully you made progress on Design Challenge 1.

This week, we’ll talk about evaluation. In Monday’s lecture, we’ll talk about many different kinds of evaluation. In class on Wednesday, we’ll do an In-Class Exercise that will help prepare for Design Challenge 2 (yes, it’s coming – immediately after DC1!).

On Friday, there will be an optional class where I will go over the ideas of DC2 – kind of a preview (so I can tune the assignment if necessary). I’ll also use this as a time to answer any extra DC1 questions.

Learning Goals (for this week)

  1. Appreciate the different approaches to visualization evaluation.
  2. Understand the nested model and its implications for how to choose evaluation strategies
  3. Appreciate “rules of thumb” based evaluation, both its benefits and limitations
  4. Practice critique
  5. (probably not time to get to it) Be exposed to empirical evaluation and understand its challenges

Mike’s Paper List from Vis 2017

by Mike Gleicher on October 8, 2017

Ones that I saw that I liked (a lot based on presentation)

  • Visual Exploration of Semantic Relationships in Neural Word Embeddings (J)
    Authors: Shusen Liu, Peer-Timo Bremer, Jayaraman J. Thiagarajan, Vivek Srikumar, Bei Wang, Yarden Livnat, Valerio Pascucci
    Video Preview | DOI
    This is a problem we’re working on, so seeing someone else do a nice job on it is particularly interesting. There are a lot of things we don’t have to do now, since this paper invented the solutions.
  • TreePOD: Sensitivity-Aware Selection of Pareto-Optimal Decision Trees (J)
    Authors: Thomas Mühlbacher, Lorenz Linhardt, Torsten Möller, Harald Piringer
    Video Preview | DOI
  • Blinded with Science or Informed by Charts? A Replication Study (J)
    Authors: Pierre Dragicevic, Yvonne Jansen
    Video Preview | DOI
    A famous study fails to replicate – despite a lot of effort to reproduce it. It’s nice to see things we take for granted challenged. I must admit I was surprised. I am not sure who to trust.
  • Conceptual and Methodological Issues in Evaluating Multidimensional Visualizations for Decision Support (J)
    Authors: Evanthia Dimara, Anastasia Bezerianos, Pierre Dragicevic
    Video Preview | DOI
    A topic I am really interested in – with a first step in thinking about it. It’s testing simple things, but really trying to get at the tasks.
  • Taking Word Clouds Apart: An Empirical Investigation of the Design Space for Keyword Summaries (J)
    Authors: Cristian Felix, Enrico Bertini, Steven Franconeri
    Video Preview | DOI
  • Assessing the Graphical Perception of Time and Speed on 2D + Time Trajectories (J)
    Authors: Charles Perin, Tiffany Wun, Richard Pusch, Sheelagh Carpendale
    Video Preview | DOI
  • Progressive Learning of Topic Modeling Parameters: A Visual Analytics Framework (J) (Best Paper Honorable Mention)
    Authors: Mennatallah El-Assady, Rita Sevastjanova, Fabian Sperrle, Daniel Keim, Christopher Collins
    Video Preview | DOI
    I didn’t see the talk – it was the same time as mine. But I spoke to the author and she went through the talk with me.
  • Nonlinear Dot Plots (J)
    Authors: Nils Rodrigues, Daniel Weiskopf
    Video Preview | DOI
    This is a simple idea that seems quite powerful. I like it because I can imagine using it.
  • Modeling Color Difference for Visualization Design (J) (Best Paper Award, InfoVis)
    Authors: Danielle Albers Szafir
    Video Preview | DOI
    Admittedly I am biased – Danielle was my student, and I worked on early versions of this stuff. But this is a case where careful consideration of details of how experiments are run allow us to build models that can help with common design decisions.

Ones that I was excited about, saw the presentation and didn’t like as much (this doesn’t include ones that I really didn’t like)

  • Visualizing Dataflow Graphs of Deep Learning Models in TensorFlow (J) (Best Paper Award, VAST)
    Authors: Kanit Wongsuphasawat, Daniel Smilkov, James Wexler, Jimbo Wilson, Dandelion Mané, Doug Fritz, Dilip Krishnan, Fernanda B. Viégas, Martin Wattenberg
    Video Preview | DOI
I don’t doubt this is great work – and worthy of an award. It’s just not about what I thought it was (it’s about tools for manipulating data flow graphs – it really has little to do with ML except that some ML is programmed this way). And it seems like a nice system, but I’m not sure what I am supposed to learn from it (I already know the authors can make nice systems). But I didn’t read the paper.
  • Visual Diagnosis of Tree Boosting Methods (J)
    Authors: Shixia Liu, Jiannan Xiao, Junlin Liu, Xiting Wang, Jing Wu, Jun Zhu
    Video Preview | DOI
  • A Workflow for Visual Diagnostics of Binary Classifiers using Instance-Level Explanations (C)
    Authors: Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanaphongs, Enrico Bertini
    Video Preview
  • Clustervision: Visual Supervision of Unsupervised Clustering (J)
    Authors: Bum Chul Kwon, Ben Eysenbach, Janu Verma, Kenney Ng, Christopher deFilippi, Walter F. Stewart, Adam Perer
    Video Preview | DOI
    nice system – but not sure what the takeaway messages were
  • TACO: Visualizing Changes in Tables Over Time (J)
    Authors: Christina Niederer, Holger Stitz, Reem Hourieh, Florian Grassinger, Wolfgang Aigner, Marc Streit
    Video Preview | DOI
    This is a great problem – that surprisingly little has been done with. I am less excited about the solution – I have to think there is something better.

Some that I would have seen, except I was in a different session…

  • The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps (J)
    Authors: Roxana Bujack, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, James Ahrens
    Video Preview | DOI
  • Visual Causality Analysis Made Practical (C)
    Authors: Jun Wang, Klaus Mueller
    Video Preview
  • Keeping Multiple Views Consistent: Constraints, Validations, and Exceptions in Visualization Authoring (J) (Best Paper Honorable Mention)
    Authors: Zening Qu, Jessica Hullman
    Video Preview | DOI
  • Imagining Replications: Graphical Prediction & Discrete Visualizations Improve Recall & Estimation of Effect Uncertainty (J)
    Authors: Jessica Hullman, Matthew Kay, Yea-Seul Kim, Samana Shrestha
    Video Preview | DOI
  • Warning, Bias May Occur: A Proposed Approach to Detecting Cognitive Bias in Interactive Visual Analytics (C)
    Authors: Emily Wall, Leslie Blaha, Lyndsey Franklin, Alex Endert
    Video Preview
  • Beyond Tasks: An Activity Typology for Visual Analytics (J)
    Authors: Darren Edge, Nathalie Henry Riche, Jonathan Larson, Christopher White
    Video Preview | DOI
  • Analyzing the Training Processes of Deep Generative Models (J)
    Authors: Mengchen Liu, Jiaxin Shi, Kelei Cao, Jun Zhu, Shixia Liu
    Video Preview | DOI
  • Understanding Hidden Memories of Recurrent Neural Networks (C)
    Authors: Yao Ming, Shaozu CAO, Ruixiang Zhang, Zhen LI, Yuanzhe Chen, Yangqiu Song, Huamin Qu
    Video Preview
  • ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models (J)
    Authors: Minsuk Kahng, Pierre Y. Andrews, Aditya Kalro, Duen Horng (Polo) Chau
    Video Preview | DOI
  • DeepEyes: Progressive Visual Analytics for Designing Deep Neural Networks (J)
    Authors: Nicola Pezzotti, Thomas Höllt, Jan van Gemert, Boudewijn P.F. Lelieveldt, Elmar Eisemann, Anna Vilanova
    Video Preview | DOI
  • A Survey on Visual Approaches for Analyzing Scientific Literature and Patents (T)
    Authors: Paolo Federico, Florian Heimerl, Steffen Koch, Silvia Miksch
    Video Preview | DOI
  • vispubdata.org: A Metadata Collection about IEEE Visualization (VIS) Publications (T)
    Authors: Petra Isenberg, Florian Heimerl, Steffen Koch, Tobias Isenberg, Panpan Xu, Charles Stolper, Michael Sedlmair, Jian Chen, Torsten Moller, John T. Stasko
    Video Preview | DOI
  • ConceptVector: Text Visual Analytics via Interactive Lexicon Building using Word Embedding (J)
    Authors: Deokgun Park, Seungyeon Kim, Jurim Lee, Jaegul Choo, Nicholas Diakopoulos, Niklas Elmqvist
    Video Preview | DOI
  • PhenoLines: Phenotype Comparison Visualizations for Disease Subtyping via Topic Models (J)
    Authors: Michael Glueck, Mahdi Pakdaman Naeini, Finale Doshi-Velez, Fanny Chevalier, Azam Khan, Daniel Wigdor, Michael Brudno
    Video Preview | DOI

Some that caught my eye, that I did not see…

  • What Would a Graph Look Like in This Layout? A Machine Learning Approach to Large Graph Visualization (J)
    Authors: Oh-Hyun Kwon, Tarik Crnovrsanin, Kwan-Liu Ma
    Video Preview | DOI
  • DSPCP: A Data Scalable Approach for Identifying Relationships in Parallel Coordinates (T)
    Authors: Hoa Nguyen, Paul Rosen
    Video Preview | DOI
  • The Hologram in My Hand: How Effective is Interactive Exploration of 3D Visualizations in Immersive Tangible Augmented Reality? (J)
    Authors: Benjamin Bach, Ronell Sicat, Johanna Beyer, Maxime Cordeil, Hanspeter Pfister
    Video Preview | DOI
  • Instant Construction and Visualization of Crowded Biological Environments (J) (Best Paper Honorable Mention)
    Authors: Tobias Klein, Ludovic Autin, Barbora Kozlíková, David S. Goodsell, Arthur Olson, M. Eduard Gröller, Ivan Viola
    Video Preview | DOI

Some that I know I should look at later…

  • A Systematic Review of Experimental Studies on Data Glyphs (T)
    Authors: Johannes Fuchs, Petra Isenberg, Anastasia Bezerianos, Daniel Keim
    Video Preview | DOI
  • Functional Decomposition for Bundled Simplification of Trail Sets (J)
    Authors: Christophe Hurter, Stéphane Puechmorel, Florence Nicol, Alexandru Telea
    Video Preview | DOI
  • A Utility-aware Visual Approach for Anonymizing Multi-attribute Tabular Data (J)
    Authors: Xumeng Wang, Jia-Kai Chou, Wei Chen, Huihua Guan, Wenlong Chen, Tianyi Lao, Kwan-Liu Ma
    Video Preview | DOI
  • QSAnglyzer: Visual Analytics for Prismatic Analysis of Question Answering System Evaluations (C)
    Authors: Nan-Chen Chen, Been Kim
    Video Preview
  • Indexed-Points Parallel Coordinates Visualization of Multivariate Correlations (T)
    Authors: Liang Zhou, Daniel Weiskopf
    Video Preview | DOI
  • Sequence Synopsis: Optimize Visual Summary of Temporal Event Data (J)
    Authors: Yuanzhe Chen, Panpan Xu, Liu Ren
    Video Preview | DOI
  • EventThread: Visual Summarization and Stage Analysis of Event Sequence Data (J)
    Authors: Shunan Guo, Ke Xu, Rongwen Zhao, David Gotz, Hongyuan Zha, Nan Cao
    Video Preview | DOI
  • Glyph Visualization: A Fail-Safe Design Scheme Based on Quasi-Hamming Distances
    Authors: Philip A. Legg, Eamonn Maguire, Simon Walton, Min Chen
    Video Preview
  • ARIES: Enabling Visual Exploration and Organization of Art Image Collections
    Authors: Lhaylla Crissaff, Louisa Wood Ruby, Samantha Deutch, R. Luke DuBois, Jean-Daniel Fekete, Juliana Freire, Claudio Silva
    Video Preview
  • Visualizing Big Data Outliers through Distributed Aggregation (J)
    Authors: Leland Wilkinson
    Video Preview | DOI
  • The Subspace Voyager: Exploring High-Dimensional Data along a Continuum of Salient 3D Subspaces (T)
    Authors: Bing Wang, Klaus Mueller
    Video Preview | DOI

From the research assignment…

by Mike Gleicher on October 8, 2017

I just went through Discussion 5 to get a sense of what papers people picked.

At the time I went through, there were 111 “votes” (I didn’t count the number of people). 59 different papers got at least 1 vote. 24 got 2 or more. Here are the ones that got 3 or more votes (with the number of votes):

  1. 8 Modeling Color Difference for Visualization Design
  2. 6 Visualizing Dataflow Graphs of Deep Learning Models in TensorFlow
  3. 5 Bring it to the Pitch: Combining Video and Movement Data to Enhance Team Sport Analysis
  4. 4 Globe Browsing: Contextualized Spatio-Temporal Planetary Surface Visualization
  5. 4 Graphiti: Interactive Specification of Attribute-based Edges for Network Modeling and Visualization
  6. 4 Scatterplots: Tasks, Data, and Designs
  7. 4 Visualizing Nonlinear Narratives with Story Curves
  8. 3 A Virtual Reality Visualization Tool for Neuron Tracing
  9. 3 How Do Ancestral Traits Shape Family Trees over Generations
  10. 3 TACO: Visualizing Changes in Tables Over Time
  11. 3 Timelines Revisited: A Design Space and Considerations for Expressive Storytelling
  12. 3 VIGOR: Interactive Visual Exploration of Graph Query Results
  13. 3 Visualizing Big Data Outliers through Distributed Aggregation
  14. 3 What Would a Graph Look Like in This Layout? A Machine Learning Approach to Large Graph Visualization

 

Discussion 5 – mistake in setup

by Mike Gleicher on October 8, 2017

I made a mistake and did not split discussion 5 into groups. This was not intentional. Hopefully, it didn’t make discussion too unwieldy. I’ll try to avoid making that mistake again.

The Week in Vis: Week 6 (Oct 9-Oct 13)

by gleicherapi on October 7, 2017

Week 6 (Mon, Oct 9-Fri, Oct 13) – Implementation

 

After a week of my travel we should be back to a more “normal” schedule…

Last week while I was away, you did some design exercises and looked at what Visualization research is. I’m curious to see what people thought was interesting (maybe I should have done that before the conference so I could have chosen to go to those talks).

This week, we’re back to design challenge 1. We’re also back to lectures and readings, although this week’s topic is a little weird. We’re going to talk about implementation. For folks who know web programming, you might actually want to tinker with D3. For everyone else, you might not care what’s happening under the hood. I’m not sure how we keep everyone happy.

On Monday, there will be a lecture with an overview of implementation strategies. On Wednesday, there will be a D3 “tutorial” where we’ll poke at some demos a bit and see what’s going on. I am going to have someone who knows a lot more about D3 lead it.

Learning Goals (for this week)

This is an odd week since it’s a place where the needs of different kinds of students really are different. It would be better if I could somehow provide a broader range of tools and implementation strategies, rather than the focus on D3.

  1. Give students a sense of the range of options for implementing visualizations
  2. Give students a notion of how to choose amongst the various implementation options
  3. Give students a sense of what D3 is, and why, and how it provides an interesting set of abstractions for thinking about implementing visualizations
  4. Give the students who will actually want to use D3 some starting points