Design Exercise 6-1: Glyphs and Graphs and Sets (oh my!)
The intent of this design exercise is to give you some experience making visualizations for different types of data: using glyph visualizations for high dimensional points, using set visualizations for set comparisons, and using network visualizations. We can apply all of these to the Flight Delay Data Set.
Note: This is a “double” design exercise (6-1 and 6-2 are put together into one). The entire thing will be turned in as one Canvas Survey, Design Exercise 6-1: Glyphs and Graphs and Sets.
The Flight Data can be viewed in many different ways (depending on what you are doing with it). Previous design exercises (Design Exercise 5-1: Experiment with the Flight Data and Design Exercise 5-2: Visualizations from the Flight Data) viewed it as a simple table. Here, we’re viewing it in several different ways:
- As High Dimensional Data - Each airport (or route) can be viewed as a point in a high dimensional space, with a bunch of different attributes. This is a great place to apply glyph design principles (as in the in class exercise).
- As Network (or Graph) Data - We have nodes (airports) and edges (flights). Many analyses deal with the connections (e.g. if you fly from one place to another, you might traverse multiple edges).
- As Set Data - Each airport or route has a set of airlines that serve it. Each airport has a set of airports you can fly directly to.
The purpose of this assignment is to get you to think about these different data types and solutions to them. We are not asking you to “implement” visualizations on the real data. We want you to think about different types of designs and what might be appropriate.
Unfortunately, actually creating these kinds of visualizations with real data is complicated. We aren’t aware of good standard tools (there is nothing like “Tableau for Graphs” or even “Excel for Graphs”). Trying to program them would require not only doing the design work, but also learning some new tools. Therefore, this assignment is a sketching assignment. We are not requiring you to work with the real data (but try to be realistic).
For most of these problems, I created a simple “baseline” visualization. In several questions, I ask you to come up with a better design and explain why you think it is better than my baseline. If the best design you can come up with isn’t as good as the baseline, you can explain why you think the baseline is better.
A bit more about the data (and terminology)
The reduced data set has 105 airports.
I use the term “route” to refer to an origin to destination pair. So, MSN to EWR is a route, EWR to MSN is a different route. There are 4305 unique routes in the reduced data set.
The term “carrier” is pretty much synonymous with “airline”. There are 16 different carriers in the reduced data set.
For the purposes of this assignment, you can assume that if flights go to an airport, they connect (this is a real over-simplification). You don’t need to worry about the connection times.
There is another weird thing in the data: the 105th airport in terms of number of flights is SFB - which only 1 airline flies to (in this data set).
The problems you need to solve
You will need to create “solutions” for these problems (each is a sketch). As in the previous sketching exercise, you are encouraged to make up values for your “data”.
- Set Problem 1A - Carriers vs. Airports
- Set Problem 1B - Airports vs. Carriers
- Set Problem 2 - Origins and Destinations
- Mid Dimensional Glyphs 1 - Destination Regions
- Mid Dimensional Glyphs 2 - Delay Distributions
- Graph Problem 1 - Number of Airlines
- Graph Problem 2 - Here to There
- Graph Problem 3 - Student’s choice
- Critiques (not really a sketch)
A note on sketching
We do intend for you to sketch (not necessarily implement with real data).
Hand-drawn sketches are encouraged.
You might use your sketch to explain your design. You could give multiple instances of the design (showing what it would look like on different data). You might use text in your drawing (possibly with callouts) to explain how we should interpret the sketch.
For each sketch, there is also a text rationale. You can put your description as part of the rationale (if it isn’t fully explained by the sketch).
Set Problem 1: Carriers and Airports, Airports and Carriers
Version 1A: Each carrier serves a set of airports. This problem has 16 sets, with 105 elements (using the reduced data).
Version 1B: Each airport is served by a set of carriers. This problem has 105 sets with 16 elements.
In a matrix view (the only thing I can do in Tableau), both are captured in the same matrix. (You might transpose the matrix - but it’s the same.)
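To make the "same matrix, transposed" point concrete, here is a minimal Python sketch. The (carrier, airport) pairs are made up for illustration and are not taken from the flight data; the variable names are hypothetical.

```python
from collections import defaultdict

# Made-up (carrier, airport) pairs for illustration; NOT the real flight data.
served = [
    ("DL", "ATL"), ("DL", "MSN"), ("DL", "DEN"),
    ("UA", "DEN"), ("UA", "MSN"), ("UA", "EWR"),
    ("WN", "DEN"), ("WN", "ATL"),
]

airports_by_carrier = defaultdict(set)  # Version 1A: one set per carrier
carriers_by_airport = defaultdict(set)  # Version 1B: one set per airport
for carrier, airport in served:
    airports_by_carrier[carrier].add(airport)
    carriers_by_airport[airport].add(carrier)

carriers = sorted(airports_by_carrier)
airports = sorted(carriers_by_airport)

# Matrix view with carriers as rows and airports as columns (1 = serves).
matrix_1a = [[int(a in airports_by_carrier[c]) for a in airports] for c in carriers]
# The same relation with airports as rows.
matrix_1b = [[int(c in carriers_by_airport[a]) for c in carriers] for a in airports]

# Version 1B's matrix is exactly the transpose of Version 1A's.
assert matrix_1b == [list(col) for col in zip(*matrix_1a)]

for c, row in zip(carriers, matrix_1a):
    print(c, row)
```

Both dictionaries hold the same information; what changes between 1A and 1B is which side you treat as "the sets" and which as "the elements", and that choice is what a specialized set visualization can emphasize.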
In case you’re wondering, you can get to 98 other airports (in the top 105) on Delta from Atlanta. You can get to 99 airports from Denver, but only 83 of them from its top airline, United. (United has lots of regional carrier flights from Denver).
Make a visualization for both of the variants of the set problem (1A carriers with airports, 1B airports with carriers). If you don’t think the two problems need different visualizations, make 2 (different) visualizations that work for both.
Of course, a visualization should be motivated by some tasks. You can provide the tasks. For each visualization, your rationale should explain the visualization (since it is a sketch, it might require more explanation). Justify why you think it’s a decent set visualization, including why it is better for some specific task(s) (of your choosing) than the baseline.
Some examples of tasks:
- Identify which airports have similar combinations of airlines.
- Identify airports where certain airlines compete (they all serve the airport).
- Identify which airlines you might need to combine to reach different groups of airports.
- Identify differences in where airlines go.
Set Problem 2: Origins and Destinations
Every origin has a set of destinations you can fly to directly. This is a core piece of treating the data as a graph.
If my analysis is correct (I tried doing this assignment myself using the real data), there are 8 airports that connect to 80 or more destinations. (Denver has flights to 99 of the top 105.) So you can consider either the top-10, or the top-8 (those with over 80).
The easiest way for me to visualize this in Tableau was to make a matrix view. A specialized set visualization might be better…
You could have made this matrix for the entire 105x105 set to show the whole “graph” (all connections) - we’ll do that in a moment when we will consider graph visualization.
For now, just consider “where can I get to from the top 8 (or 10) airports”. Or, “where can I not get to from the top airports.”
Make a visualization that is better than the baseline. Again, it’s up to you to think of some tasks to use for the comparison. Some ideas…
- If I had to pick a “base” to get to some cities, which should I choose? For some lists of cities, I might need to pick multiple bases.
- Which cities are harder to reach?
- Are there patterns in which destinations are connected to fewer or different origins?
Mid Dimensional (Glyphs) 1 - Regions
We can treat each airport as a point in high-dimensional space. The most obvious space is 2D position (latitude and longitude), but we can consider other attributes of airports as dimensions.
Imagine if we divide the country up into 4 quadrants - Northeast, Southeast, Southwest, Northwest. Each airport is in one of these 4 quadrants. I’ll color code the quadrants (note: I had a crude hack to put the cities “off the map” somewhere visible - but this was just quick vibe coding).
We could view the circles as 1 dimensional glyphs - using color to encode quadrant.
Each airport has a 4-dimensional vector with the counts of the flights it has to each of the quadrants. I could show those 4-vectors with glyphs. The first one is a pie chart (of the proportions that go to each quadrant). With the second, the angles are fixed, but the radius of each segment is proportional to the number of flights.
Ignore the details - this was a quick vibe coding experiment. Hopefully you can get the basic ideas. (if you’re wondering, the circles are distorted because they are being transformed by the map projection).
You could also imagine using these glyphs in a list of airports.
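To see how the two glyph encodings differ, here is a minimal Python sketch of the numbers each one would draw. The flight counts are made up, the function names are hypothetical, and scaling radii against the largest count over all airports is an assumption (one of several reasonable choices).

```python
QUADRANTS = ("NE", "SE", "SW", "NW")

def pie_angles(counts):
    """Pie glyph: each quadrant's wedge angle (degrees) is proportional to its share."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [360.0 * c / total for c in counts]

def fixed_angle_radii(counts, max_count, max_radius=1.0):
    """Fixed-angle glyph: every wedge spans 90 degrees; radius is proportional to count.

    max_count is assumed to be the largest count across all airports,
    so that glyphs for different airports share one scale.
    """
    if max_count == 0:
        return [0.0] * len(counts)
    return [max_radius * c / max_count for c in counts]

# Made-up flight counts per quadrant (NE, SE, SW, NW); not real data.
big_hub = (400, 300, 200, 100)
small_airport = (40, 30, 20, 10)
max_count = max(big_hub + small_airport)

for name, counts in (("big hub", big_hub), ("small airport", small_airport)):
    print(name, dict(zip(QUADRANTS, pie_angles(counts))))
    print(name, dict(zip(QUADRANTS, fixed_angle_radii(counts, max_count))))
```

With these made-up numbers the two airports get identical pie glyphs (same proportions), while the fixed-angle glyph shows the difference in volume. That kind of trade-off is worth keeping in mind when you pick tasks for each design.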
Come up with a third alternative (to show the flights from each airport to each quadrant).
Give a task that each might be “the best” for.
Right now, the maps of glyphs are non-functional because some areas are so dense with airports (for example, Southern California or New York) that the glyphs all pile on top of each other. Suggest a solution for how to address this problem (sketch and explain).
Mid Dimensional (Glyphs 2) - Delay Distributions
This time, we are designing glyphs for routes - to summarize the expected time it will take (to be precise, did take, since this is descriptive) to get from the origin to destination.
The obvious baseline would be one of the distribution visualizations we discussed in class (in the scale lecture - we showed bee swarms, histograms, KDE graphs, violin charts, …). However, we are hoping you can use design principles to create something more specialized to the problem:
- It must scale to about 2000 flights (the most common route in the data set has 1785 flights). It also should work for few flights (many routes have a small number of flights)
- It must work at a small size (think something you could put in a cell of a spreadsheet, or a column of a table).
- It should capture the “perceived reality” of delays: if a flight is a few minutes early or late, a passenger might not notice; if the flight is an hour or more late, it can be a big deal with missed connections and messed up plans for passengers. While these events are rare (the median flight arrives early, 90% of flights are less than 50 minutes late), they are significant.
In your rationale for the design, explain how your design responds to the three requirements above.
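One way to think about "perceived reality" is to reduce a route's delays to a few passenger-facing categories before drawing anything. The Python sketch below is only an illustration under assumed thresholds (15 and 60 minutes) and made-up delays; the category names and cutoffs are hypothetical, and your design may well use something different.

```python
from collections import Counter

CATEGORIES = ("fine", "annoying", "disruptive")

def perceived_category(delay_minutes, minor=15, major=60):
    """Map an arrival delay (minutes; negative = early) to a perceived category.

    The thresholds are assumptions for illustration, not values from the data set.
    """
    if delay_minutes <= minor:
        return "fine"        # early, on time, or only slightly late
    if delay_minutes < major:
        return "annoying"    # noticeably late, but under an hour
    return "disruptive"      # an hour or more late

def summarize(delays):
    """Return the share of flights in each perceived category for one route."""
    counts = Counter(perceived_category(d) for d in delays)
    n = len(delays)
    return {k: (counts[k] / n if n else 0.0) for k in CATEGORIES}

route_delays = [-12, -5, 0, 3, 8, 14, 22, 45, 75, 180]  # made-up minutes
print(summarize(route_delays))
```

A summary like this has the same size whether a route has 5 flights or 2000, which is one way to approach the scale and small-size requirements. It also throws away detail, so consider what your tasks need.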
Graph Problem Background and Basics
The flight network is clearly a graph - airports are the nodes, routes are the edges.
Even at the scale of the top 105 airports, the graph becomes a dense hairball.
Look at my discussion of route maps for a discussion of the default designs, and an innovative one that works for certain tasks and data. While the maps on that page are all for a single airline, they should help you think about the tasks and designs for route maps.
The other obvious design would be an adjacency matrix. At the scale of 105 airports, even this is hard to interpret. (The color is for the number of carriers at the airport.)
Re-ordering the matrix makes different things come out but it’s still hard:
Of course, this is just a “quick” adjacency matrix in Tableau - a better designed one would certainly work better.
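If you want to see why ordering matters in an adjacency matrix, here is a minimal Python sketch on a toy network. The connections are made up, treated as undirected for simplicity (the real routes are directed), and sorting by number of connections is just one possible ordering.

```python
# Toy undirected connections for illustration; NOT the real flight data.
edges = [("ATL", "DEN"), ("ATL", "MSN"), ("DEN", "MSN"),
         ("DEN", "EWR"), ("ATL", "EWR"), ("DEN", "LAX")]

def adjacency_matrix(edges, order):
    """Build a 0/1 adjacency matrix with rows and columns in the given order."""
    index = {name: i for i, name in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected: mirror each connection
    return matrix

alphabetical = sorted({airport for edge in edges for airport in edge})
degree = {a: sum(a in edge for edge in edges) for a in alphabetical}
# Most-connected airports first; ties broken alphabetically.
by_degree = sorted(alphabetical, key=lambda a: (-degree[a], a))

for label, order in (("alphabetical", alphabetical), ("by degree", by_degree)):
    print(label, order)
    for name, row in zip(order, adjacency_matrix(edges, order)):
        print(" ", name, row)
```

The two printouts contain the same graph; only the row and column order changes. Sorting by degree pushes the dense rows toward the top left corner.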
Node-link designs (or others that use positions for the nodes, like the example in my discussion of route maps) might try to be geographically accurate. Or, they might “distort” the geometry for clarity (while preserving the general layout) - like a metro map. Or they might lay out the nodes some other way.
Graph Problem 1 - Connectivity with Airlines
If we look at just the top 10-20 airports (top left corner of the matrix above), things are pretty much fully connected. (the ones missing might be interesting)
Create a graph design that can work for 10-20 airports showing their connection - but, with each route, show the set of airlines (you can limit yourself to 1-5 airlines on a route - if it helps, you can treat things as “5 or more”).
The assignment doesn’t specify tasks. You can describe the tasks that you have in mind with your design rationale. But overall, the viewer should get a sense of which routes have choices of airlines.
Graph Problem 2 - Getting there from here
With the top 105 airports, there are only 8 airport pairs that require 3 flights to get between. Of the 105*104 = 10920 ordered city pairs (origin to destination, so each direction counts separately), 4305 (39.4%) have direct flights, and 6607 (60.5%) can be reached with 2 flights (one stop). Madison happens to be one of those cities with a place that requires 3 flights to reach.
(in case you’re wondering, I did have co-Pilot write me a program to solve the all-pairs shortest path problem)
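For the curious, counts like these can be computed with a breadth-first search from every airport. The Python sketch below is not the program mentioned above; it is a minimal illustration on a made-up route list, with hypothetical names.

```python
from collections import Counter, deque

# Toy directed route list (origin, destination); NOT the real data set.
routes = [("MSN", "ORD"), ("ORD", "MSN"), ("ORD", "DEN"), ("DEN", "ORD"),
          ("DEN", "SFO"), ("SFO", "DEN"), ("ORD", "SFO"), ("SFO", "ORD")]

neighbors = {}
for origin, dest in routes:
    neighbors.setdefault(origin, set()).add(dest)
    neighbors.setdefault(dest, set())

def flights_needed(source):
    """Breadth-first search: minimum number of flights from source to each airport."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        here = queue.popleft()
        for nxt in neighbors[here]:
            if nxt not in dist:
                dist[nxt] = dist[here] + 1
                queue.append(nxt)
    return dist

# Tally ordered (origin, destination) pairs by the number of flights required.
tally = Counter()
for origin in neighbors:
    for dest, hops in flights_needed(origin).items():
        if dest != origin:
            tally[hops] += 1

print(dict(tally))
```

Because every edge counts as one flight, breadth-first search is enough here; no weighted shortest-path algorithm is needed. Connection times are ignored, matching the simplification stated earlier.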
Create a visualization that shows how to get to all cities from Madison. (you can skip SFB - the one city that requires 3 flights).
You only need to show the “shortest” routes (in terms of number of flights). However, you should show the various different routes of the shortest length. (so if there are routes of length 2, all routes of length 2 should be shown).
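If you want to reason about "all routes of the shortest length" rather than just one, a breadth-first search can remember every predecessor that reaches an airport in the minimum number of flights. This is a minimal Python sketch on a made-up network; the function name and data are hypothetical.

```python
from collections import deque

def all_shortest_routings(neighbors, source):
    """Return {destination: [every minimum-flight routing as a list of airports]}."""
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        here = queue.popleft()
        for nxt in sorted(neighbors.get(here, ())):
            if nxt not in dist:
                # First time we reach nxt: this is its minimum flight count.
                dist[nxt] = dist[here] + 1
                preds[nxt] = [here]
                queue.append(nxt)
            elif dist[nxt] == dist[here] + 1:
                # Another equally short way to reach nxt.
                preds[nxt].append(here)

    def expand(node):
        """Walk predecessor lists back to the source to list every routing."""
        if node == source:
            return [[source]]
        return [path + [node] for p in preds[node] for path in expand(p)]

    return {dest: expand(dest) for dest in dist if dest != source}

# Toy network for illustration; NOT the real data.
toy = {
    "MSN": {"ORD", "DTW"},
    "ORD": {"MSN", "SFO"},
    "DTW": {"MSN", "SFO"},
    "SFO": {"ORD", "DTW"},
}
routings = all_shortest_routings(toy, "MSN")
for dest, paths in sorted(routings.items()):
    print(dest, paths)
```

In this toy network there are two one-stop routings to SFO (via DTW and via ORD). How to show several alternatives per destination without clutter is the design question.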
Again, you can pick the tasks. More ambitious tasks can lead to more ambitious designs.
From Madison, there are 15 cities to fly directly to. There is one (SFB) that requires 2 stops (3 flights). The rest can be reached in 1 stop. If you want some data (again you are sketching - so don’t worry about the details), msn-shortest.txt has the output of Co-Pilot’s program. (it only lists one possible routing for each route).
Graph Problem 3 - Student’s Choice
Make another graph visualization of the flight network data. You pick the task. You can pick the subset of the data (if you like). You pick the design. Do something cool.
Some ideas if you’re stuck (you can come up with something more creative):
- Of all the paths between Madison and San Francisco, how do they compare in terms of reliability?
- We know we can get from Madison to (almost) all other cities with 1 stop… how do the different major airlines compare?
For the rationale, be sure to explain your task(s) and why you think your design is good for it. Contrast it with the “baseline” designs (the standard route map and the matrix view).
Critiques
To keep in the flight data theme, we will ask you to critique 3 designs from Design Exercise 5-2. There will be separate instructions on how to do this (you will access the 3 designs by the web and submit the critiques using a web form). Some of your critiques will be of other students’ designs (and those critiques will be provided to them) to help them improve.
Grading
There are 9 parts. Each will be scored on a 0-10/11/12 point scale (so getting all 10s gets an A). We might give high As (11) or A+ (12) for things that are truly exceptional.
Our focus is on the ideas in the design - not necessarily your artistic abilities in creating them.
- 9 (B) - a reasonable design with a complete explanation and rationale
- 9.5 (AB) - the design shows some thought and creativity to address the task, or thoroughness in the explanation and rationale.
- 10 (A) - the design shows some thought and creativity to address the task, and thoroughness in the explanation and rationale.
- 11 (A+) - the design has clear creativity with clear rationale
For the critiques, you will get 3 points for each critique that is sufficient, and +1 (or more) for providing useful insights that either explain the excellence of the design, or can help the designer understand what is going wrong.
Mechanics
The sketches will be turned in as a Canvas Survey: Design Exercise 6-1: Glyphs and Graphs and Sets (due Fri, Nov 21). For each of the 8 sketches, you will upload a file and provide the rationale in a text box.
The critiques will use a different mechanism for distribution (via the web) and turning in the critiques. We will make an announcement for details.
GenAI Disclosure:
I used GitHub co-pilot to write programs for me (using either the GPT-5 or Claude Sonnet 4.5 models).
I had co-pilot write various scripts to analyze the data and the graph/set structure.
I had co-pilot write programs to make various visualizations (including the maps shown here).
I asked Gemini for hints of getting Tableau to make the graphs that I wanted.