Module 6: Standard Problems (Nov 10-21)
We will look at some standard “hard problems” that come up, including hard data properties (e.g., uncertainty, scale), data types (e.g., networks, high-dimensional records), and applications. We will look at some of the standard approaches to address these problems. The design exercises will give more practice at creating and critiquing visualizations.
Introduction
In this module, we’ll look at a few common types of “hard problems” (in terms of data/problem type). We’ll focus on a few (graphs, sets, high dimensional data), and try designing visualizations for them in the design exercise.
Note: We can’t do all the “hard problems” in 2 weeks. Some of the “standard hard problems” we have seen already (scale), and others will be pushed off until the next module.
Summary (and recommended schedule)
The design exercise, readings, and lectures are tightly coupled. And the content survey will ask you about all of it.
Read through Design Exercise 6-1: Glyphs and Graphs and Sets (oh my!). This is a combined exercise that puts all parts into one. You will want to pace yourself: it has a lot of small parts, all due in one survey at the end of the module, Design Exercise 6-1: Glyphs and Graphs and Sets (due Fri, Nov 21).
The readings cover the range of topics you will hear about in class (and see in the design exercise). Dimensionality reduction is not part of the design exercise, and is the last of the lectures, so you might want to read that last. You will need it for the content survey.
You can do Seek and Find 6: Graphs (due Fri, Nov 21) any time after the first graph lecture (or reading).
The Content Survey 6: Hard Data Types (due Fri, Nov 21) should be at the end - it refers to the lectures, readings, and design exercise. The Class Survey 6: Module 6 content (due Fri, Nov 21) is also best at the end.
Module Learning Outcomes (Goals)
- Identify network (graph) data and describe standard visualization approaches for it
- Identify high-dimensional (multivariate) data and describe standard visualization approaches for it
- Describe standard approaches for visualizing uncertainty in data
- Describe standard approaches for visualizing set data
- Practice creating and critiquing visualizations
Readings
This week’s readings cover a range of “hard data type” problems. You’ll encounter some of them in the design exercises.
High Dimensional Approaches (Mid-Dimensional)
We’ll divide “high-dimensional” data visualization into two different levels: not-so-high (medium dimensional) and really high dimensional. In medium dimensional data (approximately 4-12 dimensions), we have a chance of actually showing all the dimensions. Beyond that, we need to use dimensionality reduction (below).
I want you to look at a historical survey of “mid-dimensional” approaches to get a sense of all the crazy things people have tried. A few have stood the test of time (e.g., scatterplot matrices and parallel coordinates). Others seem ridiculous. Unfortunately, none of the surveys are complete - and the best one I can find seems to appear and disappear in different versions (and it still misses many different things). When you read through this, consider that most of these ancient (this paper is circa 2001) designs did not pass the test of time.
- (required) Georges Grinstein, Marjan Trutschl, Urska Cvek. High-Dimensional Visualizations. KDD approximately 2001. (web pdf) (url)
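To make the "4-12 dimensions" range a little more concrete, here is a minimal Python sketch of the bookkeeping behind the two designs named above as having stood the test of time. It is an illustration only, not taken from the readings: the function names and the sample records are hypothetical. The first function counts the distinct panels a scatterplot matrix needs; the second rescales each dimension to [0, 1] so that each record becomes one polyline across parallel axes.

```python
from itertools import combinations


def splom_panel_count(num_dims):
    """Number of distinct off-diagonal panels in a scatterplot matrix."""
    return len(list(combinations(range(num_dims), 2)))


def parallel_coordinates_polylines(records):
    """Rescale each dimension to [0, 1] so every record becomes one polyline.

    `records` is a list of equal-length numeric tuples (hypothetical data).
    Returns one list of (axis_index, height) points per record.
    """
    columns = list(zip(*records))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    polylines = []
    for record in records:
        line = []
        for axis, value in enumerate(record):
            span = highs[axis] - lows[axis]
            # A constant column has no spread; place it mid-axis.
            height = 0.5 if span == 0 else (value - lows[axis]) / span
            line.append((axis, height))
        polylines.append(line)
    return polylines


if __name__ == "__main__":
    for dims in (4, 12):
        print(dims, "dimensions ->", splom_panel_count(dims), "scatterplot panels")
    print(parallel_coordinates_polylines([(1, 10, 5), (3, 20, 5)]))
```

The panel count grows quadratically with the number of dimensions (6 panels at 4 dimensions, 66 at 12), which is one way to see why showing every dimension directly stops being practical somewhere past the mid-dimensional range.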
Glyphs
Glyphs (small mini-pictures that map different data to different visual features) are a special case of mid-dimensional visualizations. Some specific glyph designs (like Chernoff faces) have been tried, but there is a more modern effort to design effective glyphs. We’ll learn about glyphs mainly by trying them out (in the in-class exercise and the design exercise).
These papers discuss packing large amounts of information into small pictures. I recommend looking at one of these for ideas when you need to make your own.
- (optional - but recommended) Rita Borgo, J. Kehrer, D. H. S. Chung, E. Maguire, R. S. Laramee, H. Hauser, M. Ward, & M. Chen. Glyph-based Visualization: Foundations, Design Guidelines, Techniques and Applications. Eurographics State of the Art Reports, 2013. (doi) (web pdf) (url)
- (alternate) Johannes Fuchs; Petra Isenberg; Anastasia Bezerianos; Daniel Keim. A Systematic Review of Experimental Studies on Data Glyphs. IEEE Transactions on Visualization and Computer Graphics, 23(7) 2017. (doi) (web pdf)
I like this paper as an example of a well thought out glyph design:
- (optional) Eamonn Maguire and Philippe Rocca-Serra and Susanna-Assunta Sansone and Jim Davies and Min Chen. Taxonomy-Based Glyph Design—with a Case Study on Visualizing Workflows of Biological Experiments. IEEE Transactions on Visualization and Computer Graphics, 18(12) 2012. (doi) (web pdf)
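As a small illustration of what "mapping different data to different visual features" can mean, here is a Python sketch of one simple glyph: a star glyph, where each dimension gets its own spoke and the value sets the spoke length. This particular design is an example chosen for this sketch, not one the readings single out, and the function name and the assumption that values are already normalized to [0, 1] are hypothetical.

```python
import math


def star_glyph_vertices(values, radius=1.0):
    """Return the outline vertices of a star glyph for one record.

    Assumes `values` are already normalized to [0, 1]; there is one
    spoke per dimension, evenly spaced around the glyph center.
    """
    count = len(values)
    vertices = []
    for index, value in enumerate(values):
        angle = 2 * math.pi * index / count
        length = radius * value
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    return vertices


if __name__ == "__main__":
    # Four dimensions: the polygon through these points is the glyph outline.
    for x, y in star_glyph_vertices([1.0, 0.5, 1.0, 0.0]):
        print(round(x, 3), round(y, 3))
```

Even in this tiny case there are design decisions to critique: which dimension goes on which spoke, and whether a zero value (which collapses a spoke to the center) is readable.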
Dimensionality Reduction
You should learn about the mathematics of dimensionality reduction in some other class. But, here are three “interactive tutorials” that I like not only because they give you the intuitions, but also because they use interactive visualization to convey their points.
- (required) Matthew Conlen and Fred Hohman. The Beginner's Guide to Dimensionality Reduction. An Idyll interactive workbook. (url) - This is a very basic demonstration of the basic concepts of dimensionality reduction. It doesn’t say much about the “real” algorithms, but you should get a rough idea if you haven’t already.
- (required) Wattenberg, Martin and Viégas, Fernanda and Johnson, Ian. How to Use T-SNE Effectively. Distill Interactive Journal. (doi) (url) - I wanted to give you a good foundation on dimensionality reduction. This isn’t it. But… it will make you appreciate why you need to be careful with dimensionality reduction (especially fancy kinds of it).
- (optional - but recommended) Andy Coenen, Adam Pearce. Understanding UMAP. (url) - I like this as a way to explain the UMAP algorithm. It is a mix of the details, but also the intuitions. It is less important to understand UMAP, but more to get a sense of what these kinds of algorithms do.
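For a rough feel of what the simplest kind of dimensionality reduction does, here is a minimal pure-Python sketch of a linear projection onto the direction of greatest variance (the first principal component), found by power iteration. This is an illustration only: it is not the method of any of the tutorials above, t-SNE and UMAP work very differently, and the function names and the all-ones starting vector are assumptions made for this example.

```python
import math


def center(points):
    """Subtract the per-dimension mean from every point."""
    dims = len(points[0])
    means = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    return [[p[d] - means[d] for d in range(dims)] for p in points]


def covariance(centered):
    """Sample covariance matrix of already-centered points."""
    n, dims = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(dims)] for i in range(dims)]


def top_component(cov, iterations=200):
    """Power iteration: repeatedly apply the matrix and renormalize."""
    # Assumed start vector; it must not be orthogonal to the answer.
    vec = [1.0] * len(cov)
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    return vec


def project_to_1d(points):
    """Reduce each point to its coordinate along the top component."""
    centered = center(points)
    axis = top_component(covariance(centered))
    return [sum(a * x for a, x in zip(axis, row)) for row in centered]


if __name__ == "__main__":
    # Hypothetical 3D points that all lie along a single line.
    points = [(t, 2 * t, 0.0) for t in (-2, -1, 0, 1, 2)]
    print(project_to_1d(points))
```

Because these sample points lie on a line, one coordinate captures them exactly. Real data rarely behaves this well, which is part of why the readings above stress being careful about what a reduced picture does and does not show.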
Graphs
At some places, they have whole classes on graph visualization. A few readings will hopefully get across the main ideas. We won’t really get into layout algorithms much.
- (required) Tamara Munzner. Arrange Networks and Trees. Chapter 9 from Munzner's Visualization Analysis & Design. (Canvas File) (UW Library) - The book will give you the basics and help you realize there are many alternatives.
- (required) TreeVis.net. (url) - This is a visual survey of alternatives for the special case of trees. Have a look to see a range of possibilities.
- (optional - but recommended) Michael Gleicher. Airline Route Maps: An interesting Solution to a Node-Link Problem. (url) - This used to be an in-class exercise. I will refer to it in the design exercise and content survey.
- (optional - but recommended) Helen Gibson, Joe Faith, Paul Vickers. A survey of two-dimensional graph layout techniques for information visualisation. Information Visualization, 12(3–4), 324–357. (doi) (url) - I recommend that you skim this one to get a sense of the range of algorithms out there for graph layout.
- (optional) Kobourov, S. Force-Directed Drawing Algorithms. In Handbook of Graph Drawing (pp. 383–408). (doi) (web pdf) - This is a review of the classical algorithms.
- (optional) Tamara Munzner. 15 Views of a Node-Link Graph: An InfoVis Portfolio. Google TechTalks. (web pdf) (video) - Tamara Munzner gave a talk showing that there are many ways to draw a graph, with lots of design choices and options. Plus, you’ll get a sense of the person behind the book (although this was long ago). But sitting through the hour is a bit much, so it’s OK to just watch a little bit and read through the slides.
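Since we won't get into layout algorithms much, here is a minimal Python sketch of the general idea behind the force-directed family covered in the Kobourov review: all node pairs push apart, edges pull their endpoints together, and the movement per step shrinks over time. It is a simplified illustration in the spirit of those algorithms, not a faithful implementation of any specific published one. The function name, the force formulas, the ideal-length heuristic `k`, and the cooling schedule are all assumptions made for this example.

```python
import math
import random


def force_directed_layout(nodes, edges, iterations=300, seed=0):
    """Return {node: (x, y)} from a simple spring-style simulation.

    `nodes` is a list of hashable ids; `edges` is a list of (a, b) pairs.
    """
    if not nodes:
        return {}
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # assumed "ideal" edge length
    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: every pair of nodes pushes apart.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force
        # Attraction: each edge pulls its two endpoints together.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Cooling: cap how far a node may move, less each step.
        temperature = 0.1 * (1 - step / iterations)
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1])
            if length > 0:
                scale = min(length, temperature) / length
                pos[n][0] += disp[n][0] * scale
                pos[n][1] += disp[n][1] * scale
    return {n: (p[0], p[1]) for n, p in pos.items()}


if __name__ == "__main__":
    # Hypothetical four-node path graph.
    layout = force_directed_layout(list("abcd"), [("a", "b"), ("b", "c"), ("c", "d")])
    for node, (x, y) in layout.items():
        print(node, round(x, 2), round(y, 2))
```

Note that the result depends on the random starting positions (fixed here by `seed`), so the same graph can come out looking different from run to run. That sensitivity is one of the design issues the readings above discuss.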
Sets
Set type data isn’t as common as the others. The UpSet paper is a classic - I want you to look at it as an example of important visualization work. Unfortunately, the setvis interactive/visual survey is no longer online. The survey paper isn’t quite as fun.
- (required) Bilal Alsallakh, Luana Micallef, Wolfgang Aigner, Helwig Hauser, Silvia Miksch, Peter Rodgers. The State-of-the-Art of Set Visualization. Computer Graphics Forum 35(1) (EuroVis '15 State of the Art Report). (doi) (web pdf) (url) - Skimming this survey to get an idea of what is out there is sufficient. You will need to have some ideas for the design exercise.
- (optional - but recommended) Visual Techniques for Analysing Set-typed Data. Keshif Gallery. (url) - Unfortunately, the visual gallery that went along with the set survey paper is no longer on the web. This visual gallery isn’t as complete, or as good, but you can still see lots of pictures.
- (required) Alexander Lex, Nils Gehlenborg, Hendrik Strobelt, Romain Vuillemot, and Hanspeter Pfister. UpSet: Visualization of Intersecting Sets. IEEE Transactions on Visualization and Computer Graphics 20(12), (Proc InfoVis '14). (doi) (web pdf) (url) (video) - This is a classic example of a good solution to an important problem that has been successful. A big part of their success is that they provided good implementations.
- (optional) Ramik Sadana, Timothy Major, Alistair Dove, and John Stasko. OnSet: Tackling Large-Scale Set Data. IEEE Transactions on Visualization and Computer Graphics 20(12), (Proc InfoVis '14). (doi) (web pdf) (url) - This came out at the same time as UpSet. It didn’t catch on. It’s an interesting contrast.
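To get a feel for the data behind an UpSet-style view, here is a minimal Python sketch that groups elements by the exact combination of sets they belong to (the "exclusive intersections") and counts each group, largest first. It is a simplified illustration of that one idea, not the UpSet implementation: the function name, the tie-breaking order, and the sample sets are hypothetical.

```python
from collections import Counter


def exclusive_intersections(named_sets):
    """Count elements by the exact combination of sets containing them.

    `named_sets` maps a set name to a Python set of elements.
    Returns (frozenset_of_names, count) pairs, largest count first,
    with ties broken by the sorted set names.
    """
    membership = {}
    for name, members in named_sets.items():
        for item in members:
            membership.setdefault(item, set()).add(name)
    counts = Counter(frozenset(names) for names in membership.values())
    return sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))


if __name__ == "__main__":
    # Hypothetical sets for illustration.
    sets = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 6, 7, 8}}
    for names, count in exclusive_intersections(sets):
        print("&".join(sorted(names)), count)
```

With n sets there can be up to 2^n - 1 non-empty combinations, so sorting and filtering these counts is a large part of what makes such a view usable. Compare that with how quickly a Venn diagram becomes unreadable as sets are added.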
Lecture Plan
- Monday 1 (Nov 10) - High Dimensional Data
- Wednesday 1 (Nov 12) - Graphs and Sets
- Monday 2 (Nov 17) - Graph Layout
- Wednesday 2 (Nov 19) - Dimensionality Reduction
Assignments
- Design Exercise 6-1: Glyphs and Graphs and Sets (oh my!) - we are combining all the design exercise components into one bigger design exercise with a single hand-in survey, Design Exercise 6-1: Glyphs and Graphs and Sets (due Fri, Nov 21), and some other pieces that are turned in a different way.
- Seek and Find 6: Graphs (due Fri, Nov 21) - asks for a network visualization.
- Content Survey 6: Hard Data Types (due Fri, Nov 21) - refers to the lectures, readings, and design exercise.
- Class Survey 6: Module 6 content (due Fri, Nov 21) - as usual, we have one of these.