This is a possible problem to work on for Design Challenge 2: A Visualization Project (Hard Vis Problems): it is choice 3, Tiny Charts.
In this option for Design Challenge 2, your task is to consider how to automatically make small versions of a standard chart type. The idea is to create a tool that would automatically “adapt” a chart to a smaller size (really, you will be generating a new chart that is effective at smaller sizes).
This project is very much inspired by the Vis short paper Semantic Resizing of Charts Through Generalization: A Case Study with Line Charts. I strongly recommend watching the talk (it was a great talk), and looking at the paper to get a sense of the problem. The paper considers line charts: your project would be to do something similar for another chart type.
No, I don’t expect you to write an award-winning paper in a short class project. But I think you can get pretty far in terms of designing and prototyping.
The challenge is to create an implementation of a basic chart type that can produce a good chart for different sizes, especially as the size gets small. Ideally, your implementation would work well for very small charts, as well as “normal sized” charts. But, since good implementations of normal sized charts are easy to find, you don’t have to worry about that. It is OK to say “if the chart is X by Y or bigger, use Z to make the chart” (where X and Y are numbers like 300 pixels or two inches, and Z is some existing library or tool for making a chart).
In fact, you should use the standard implementation as a baseline. Actually, there are three possible baselines: (1) have the standard tool make a small chart; (2) produce the chart as a normal-size image using a standard tool and downsample it; and (3) produce the chart as vector art and scale it. Depending on the tool, not all of these are possible. However, as part of the assignment you will need to pick a baseline to compare your solution to.
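To see why baseline (2) often falls short, here is a minimal sketch of downsampling with a box filter. It is pure Python on a tiny made-up grayscale image; the function name `downsample_box` and the 8x8 test image are illustrative assumptions, and a real baseline would use whatever resampling your image library provides.

```python
def downsample_box(pixels, factor):
    """Average each factor x factor block of a grayscale image (a list of rows)."""
    rows, cols = len(pixels), len(pixels[0])
    out = []
    # Ignore any leftover rows/columns that do not fill a whole block.
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [pixels[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Assumed test image: 8x8 white (255) with a one-pixel-wide black line at x = 2.
image = [[0 if x == 2 else 255 for x in range(8)] for _ in range(8)]
small = downsample_box(image, 4)
print(small)  # [[191.25, 255.0], [191.25, 255.0]]
```

The one-pixel black line turns into a light gray smear (191.25 out of 255). That is the kind of loss your tool should be able to beat.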
For this design challenge option, you must make a tool that takes a data set and a size, and produces a visualization (either image or vector). You will need to provide examples of what your program does (which includes creating testing/demo datasets). But the challenge is not to make a few small visualizations: it is to make a “tool”. This might be a command line program that takes a data file and size information on the command line and outputs a file. It might be a web application where you somehow provide the data (as a URL or file) and size and it creates an image on the page (bonus if you have a live slider so you can change the size and see how things change, like the demo in the paper talk). It’s even OK if it’s a Python (or other language) script which takes the information and outputs an image or vector file (although, if you do this, it is pretty trivial to make a command line program).
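As a rough idea of how small the “tool” wrapper can be, here is a sketch of a command line program using only the Python standard library. Everything specific in it is an assumption for illustration: the two-column `label,value` CSV layout with a header row, the flag names, the default output name, and the plain bar rendering (which does no adaptation at all; that part is your job).

```python
import argparse
import csv


def read_values(path):
    """Read (label, value) pairs from a two-column CSV with a header row."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        return [(row[0], float(row[1])) for row in reader if len(row) >= 2]


def render_bars_svg(data, width, height):
    """Return an SVG string with one unlabeled bar per (label, value) pair."""
    top = max([v for _, v in data] + [0.0]) or 1.0  # avoid dividing by zero
    slot = width / max(len(data), 1)
    bars = []
    for i, (_, v) in enumerate(data):
        h = height * max(v, 0.0) / top  # negative values are drawn as empty
        bars.append(
            f'<rect x="{i * slot:.2f}" y="{height - h:.2f}" '
            f'width="{slot * 0.8:.2f}" height="{h:.2f}" fill="#444"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}" viewBox="0 0 {width} {height}">'
        + "".join(bars) + "</svg>"
    )


def main(argv=None):
    parser = argparse.ArgumentParser(description="Render a chart at a given size.")
    parser.add_argument("data", help="CSV file with label,value columns")
    parser.add_argument("--width", type=int, default=64)
    parser.add_argument("--height", type=int, default=64)
    parser.add_argument("--out", default="chart.svg")
    args = parser.parse_args(argv)
    svg = render_bars_svg(read_values(args.data), args.width, args.height)
    with open(args.out, "w") as f:
        f.write(svg)

# In a script, finish with:  if __name__ == "__main__": main()
```

With that guard added and the file saved under a hypothetical name like `tinychart.py`, it would be run as `python tinychart.py data.csv --width 48 --height 32 --out tiny.svg`.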
Originally, I was going to make this option be “programming only” - and I still encourage implementation for this. However, if you choose to make this a non-programming project, you must be very careful to design things such that they could be implemented. You cannot define processes that require human intelligence to perform the semantic size reduction: you must explain how your “design” could be created by an automated process.
The key is that your tool must adapt the visualization to the small size. This means you have to determine what it means to effectively convey information at the smaller size, as well as devise algorithms that do it.
Some things you might want to consider - in no particular order:
- Recognizing the limits of how much data can be shown effectively in the small space is key. You will need to throw information away.
- A key (maybe the key) is to reduce the amount of data shown. But this needs to be done selectively so the most important things are kept.
- Sometimes, distortions are required. Small things sometimes need to be made bigger so they are still visible.
- As sizes get smaller, our ability to resolve colors changes. If things are small, you need to have more distinctive colors. See Modeling Color Difference for Visualization Design
- Things like line thicknesses and text sizes need to be reduced, but they can only be reduced so far. Text that is smaller than a certain size stops being readable, marks must be sufficiently big to be seen, lines really shouldn’t be thinner than a pixel, etc.
- Designs that scale continuously are nice, but not always possible.
- Consider the main, important things that a user would want to see when they look at the visualization. And make sure this is still visible in the reduced size version.
- Generating labels and positioning axis ticks automatically can be a surprisingly hard problem. It is OK to have simple strategies. For small visualizations, there may not be room for labels.
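Two of the considerations above (throwing data away selectively, and limiting how far line widths and text can shrink) can be sketched in a few lines. This is one possible approach under stated assumptions, not a recommended design: the function names, the 600-pixel reference width, the 7-pixel minimum font size, and the “other” bucket are all made up for illustration.

```python
def keep_top_k(data, k):
    """Keep the k-1 largest (label, value) items and sum the rest into 'other'.

    Kept items stay in their original order. The 'other' label is an assumption.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(data) <= k:
        return list(data)
    ranked = sorted(range(len(data)), key=lambda i: data[i][1], reverse=True)
    keep = set(ranked[:k - 1])
    kept = [data[i] for i in range(len(data)) if i in keep]
    rest = sum(data[i][1] for i in range(len(data)) if i not in keep)
    return kept + [("other", rest)]


def style_for_size(width_px, base_width_px=600, base_font_px=12.0,
                   base_stroke_px=1.5, min_font_px=7.0, min_stroke_px=1.0):
    """Scale stroke and font with chart width, but never below assumed minimums."""
    scale = width_px / base_width_px
    stroke = max(base_stroke_px * scale, min_stroke_px)  # no sub-pixel lines
    font = base_font_px * scale
    show_labels = font >= min_font_px  # drop labels instead of drawing tiny text
    return {
        "stroke_px": stroke,
        "font_px": font if show_labels else None,
        "show_labels": show_labels,
    }


print(keep_top_k([("a", 5), ("b", 1), ("c", 3), ("d", 2)], 3))
# [('a', 5), ('c', 3), ('other', 3)]
print(style_for_size(100))
# {'stroke_px': 1.0, 'font_px': None, 'show_labels': False}
```

Whether “largest value” is the right notion of “most important” depends on the task you decide to support; that choice is part of your design rationale.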
The notion of “normal size” and “small” is hand-wavy, at best. Small could mean really tiny (like a 32x32 or 128x128 icon), or small like made for a smart watch display, or small for a small image in a paper, or for when a webpage gets shown on a phone, or a thumbnail. These different applications also suggest different tasks, which might influence design.
For this assignment, consider static visualizations. You can’t rely on the viewer zooming in, or getting a magnifying glass to read tiny marks or letters (assuming you had the resolution to show them).
A hint: think about what the most important things to show might be, and try to develop designs that preserve that. For a visualization, the “most important things” might vary based on task. But you have the option to define that. To use the line graph visualization as an example… the Douglas-Peucker algorithm they use preserves the local extrema, which can make things look more jagged than they really are.
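To make the line graph example concrete, here is a sketch of the textbook Douglas-Peucker simplification (the paper’s exact variant may differ, and line charts are off limits for your project; this is only to illustrate the hint). The function names and the sample data are assumptions.

```python
import math


def _point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def douglas_peucker(points, epsilon):
    """Drop points that lie within epsilon of the simplified polyline."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_line_distance(points[i], a, b)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [a, b]
    # Keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right


series = [(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0), (5, 0.1), (6, 0)]
print(douglas_peucker(series, 1.0))
# [(0, 0), (2, 0), (3, 5), (4, 0), (6, 0)]
```

The small wiggles at x = 1 and x = 5 are dropped, while the spike at x = 3 survives at full height. That is the “preserves extrema” behavior, and also why a noisy series can come out looking spikier than it really is.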
A big hint: Watch the presentation for the paper Semantic Resizing of Charts Through Generalization: A Case Study with Line Charts (I recommend reading the paper too). Look at the references in the paper. There are tons of ideas out there. I am hoping that people choose to do semantic resizing for charts where there aren’t solutions in the literature (you cannot do line charts).
Some other papers that might help:
- Histogram binning revisited with a focus on human perception - might be helpful if you plan to try this for bar charts.
- Exploring the Placement and Design of Word-Scale Visualizations - considers the extreme of small scale, and mainly considers where to place them in text. But it can help you think about very small visualizations. There is a lot of follow-up on this work, including this recent one that I haven’t read yet.
- There are tons of references in
What chart type to choose
The paper has already done scaling for line graphs. Your job is to do it for some other standard chart type. Pie charts are probably too easy. But…
Might be too easy (there are some obvious methods) - you could do this, but you’ll need to convince us that your solution is compelling:
- Bar charts (especially histograms)
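One of the obvious methods for histograms is to merge adjacent bins until every bar gets some minimum width on screen. A minimal sketch, assuming equal-width bins and a made-up 2-pixel minimum bar width (the function name `merge_bins` is also just for illustration):

```python
import math


def merge_bins(counts, width_px, min_bar_px=2):
    """Sum adjacent histogram bins so each bar is at least min_bar_px wide."""
    max_bars = max(width_px // min_bar_px, 1)
    factor = math.ceil(len(counts) / max_bars)  # bins merged into each bar
    if factor <= 1:
        return list(counts)
    # The last group may hold fewer than `factor` bins, so it covers a
    # narrower data range; a real design has to decide how to show that.
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


print(merge_bins([1, 2, 3, 4, 5, 6, 7, 8], width_px=8))  # [3, 7, 11, 15]
```

A compelling solution would need to go beyond this, for example by choosing bin edges that keep the features a viewer cares about.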
Probably about the right level of challenge:
- Stacked bar charts
- Tree Maps
- Parallel Coordinates (see Augmenting Parallel Coordinates Plots with Color-coded Stacked Histograms for an interesting idea that might work well at small size)
Might be too hard:
- Node link diagrams
- Node link representations of trees
Might have too many options:
- Scatterplots - there are a lot of approaches for dealing with what to do with scatterplots as the density of points goes up. That is one of the key issues with making small scatterplots.
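As one example of those approaches, points can be aggregated into a coarse grid of counts, so that a tiny chart shows density instead of overplotted marks. This is a sketch under assumptions (the function name, the grid size, and the row order are all illustrative choices), not a claim that binning is the right answer.

```python
def bin_points(points, cols, rows):
    """Count (x, y) points per cell of a cols x rows grid over the data extent.

    Row 0 holds the smallest y values; flip the rows when drawing if needed.
    """
    grid = [[0] * cols for _ in range(rows)]
    if not points:
        return grid
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x0, y0 = min(xs), min(ys)
    xspan = (max(xs) - x0) or 1.0  # avoid dividing by zero for constant data
    yspan = (max(ys) - y0) or 1.0
    for x, y in points:
        # Points on the upper edge are clamped into the last cell.
        c = min(int((x - x0) / xspan * cols), cols - 1)
        r = min(int((y - y0) / yspan * rows), rows - 1)
        grid[r][c] += 1
    return grid


print(bin_points([(0, 0), (0.1, 0.1), (1, 1), (0.9, 0)], cols=2, rows=2))
# [[2, 1], [0, 1]]
```

Note that plain binning hides outliers inside cells, which may or may not matter for the task you decide to support.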
If there’s another one you want to try, you can ask.
What you must do
These are requirements, and evaluation criteria:
- You must provide a rationale for your design. Explain what your tool does to make sure the visualizations are still effective at small scale. This includes defining what effective is. You should explain your algorithmic choices as well.
- Provide a tool that can work for a variety of data. Document its limits.
- Provide examples that show how well it works. This includes generating good demonstration cases. Compare the results to the baseline, and explain why you think your results are better (or not).
Note: we are open to the possibility that you might experiment with something that turns out not to be effective. If you have a design that is plausible, an implementation that seems to correctly realize it (as demonstrated by the examples), but when you assess the results you find they are not as useful as you had hoped (relative to the baseline), that’s OK - provided you give a good explanation. A “negative result” should explain why what you tried was plausible, and why it turns out to be wrong in hindsight.
The Ground Rules
The rules are common for all DC2 options. Some specifics:
- For your proposal (and probably phase 1), you should describe the visualization design (what do you expect the algorithm to do?). What does it mean to make an effective small version? What goes wrong if you just do things naively?
- You are required to provide example data used to demonstrate and test your tool.
- Comparisons to the baseline are not optional. You can pick a standard tool for making the baseline.