Grading of the ATUS Visualization Assignments

This grading “rubric” is for Design Exercise 9: Visualization Hand-Ins. If you turned in question ideas (for DE7) and drafts (for DE8), we assume that they were acceptable (if not, it will probably show in your final visualizations). Similarly, we are not doing a close assessment of the critiques of the critiques (DE8b) - except in a very small number of cases, the critiques were at least acceptable.

General Strategy

Grading was done in two passes:

  1. I looked at about half of the charts in detail as a calibration pass, looking at individual charts. For many of these, some scoring was done; if a chart was scored, you will get that feedback. See “Per Chart Assessments” below. The numerical scores are not really used; they are on a 1-5 scale (1=bad, 5=exceptional, 0=not scored).

  2. I looked at each student's whole assignment (5 visualizations and the description) and scored it as described in “Assignment Grading” below. During this process, I looked at any per-chart assessments that were done. These scores are on a -2 to 2 scale.

There is a final score on the -2 to 2 scale.

People generally did well (or I was generous). The median score was 1.

The rough translation…

  • 2 = great assignment, worthy of an A
  • 1 = good assignment, not an A on its own (roughly a solid AB)
  • 0 = acceptable assignment (generally, a high B - doesn’t rule out an AB for the class, but below the quality of work I’d expect from an AB student).

Per Chart Assessments

We assessed some of the charts in detail. We couldn't do all of them in detail - we did look at them all, but we often looked at them as a set per student. We did the small number of random detailed assessments to calibrate. If we did some of yours, we are giving you the feedback.

In addition to notes, there are some “measures”. These are generally on a scale of 1-5 (1=bad, 5=exceptional). A 0 means not applicable (or not scored).

The numbers are pretty variable - lots of variance. They mainly serve as a check during the “whole assignment” phase (which is a separate pass).

Some general observations

  1. It is difficult for me to distinguish “careful selective minimalism” from a minimal attempt to show something. Sometimes carefully picking a few things to show makes an interesting story come out. Sometimes I see it. Sometimes, I might miss it.

  2. The subjectivity about story is weird - because sometimes nothing comes out, and that's what's interesting. These “null stories” really need to be called out (since it's about the viewer having a correct expectation to match against).

  3. Part of what makes a story (even a null story) is the element of surprise (or lack of it). Often this is not called out (by the title, caption, or even discussion). If it's not called out, whether I pick up on it is hard to predict and somewhat random.

Subjective rating

Normally, we’d use the -2 to 2 scale (0 = meets expectations), but here we center on 3 (so 3 = meets expectations, 5 = great, 1 = doesn't meet expectations). We kept outliers on the 1-5 scale.

Categories

  • (BC) Basic Chart Types (includes basic faceted charts)
    • see details (at what point do good details move to next category?)
  • (CB) Complex Basic Charts (includes faceted versions)
    • A. distribution charts (error bars, box plots)
    • B. basic charts with extra variables
    • C. multi-heatmaps
  • (CC) Compound Charts (not just multi-faceted)
  • (UD) Unusual Design

Small multiples might be BC or CC (not consistent). At times I mixed BC and CB together (so some CB were marked BC).

Multi-Variate

This was a requirement.

  1. 2 or fewer variables (e.g., distribution over 1D)
  2. barely multi-variate (e.g., two line charts)
  3. somewhat multi-variate
  4. brings many variables
  5. richly multi-variate - good use of variables to tell story

Multi-Chart Connection (a kind of detail)

  1. connection not supported (1.5 minimal connection)
  2. connection at story level
  3. limited connection
  4. designs connect
  5. clearly made choices to connect

A minimal connection might be the use of common colors, even though it is difficult to read more into the chart combination.

Details

  1. Substandard
  2. Minimal (OK labels, some insufficient elements)
  3. Good (no obvious flaws) or Mix (some positive elements, but some problems)
  4. Well done (good choices to bring out story)
  5. Exceptional

How this plays out for basic charts

  1. just basics
  2. some attention to details
  3. mix (some bad choices with some good ones)
  4. well done (good choices and details)
  5. very well done (particularly interesting choices, embellishments)

Story

  1. No story (e.g., Data Dump)
  2. Story only comes out from description (and post-hoc) or doesn’t come out clearly
  3. Story comes out
  4. Clear Story - identified and told
  5. (exceptional) Interesting story - well told

Some examples of how specific cases were rated:

  • 1 = Data Dump
  • 1 = Unclear if there is a story
  • 1.5 = Dump that shows expected trends
  • 2 = Needed explanation from the description
  • 2 = Null story (the surprise is that there is not much to see); a well-told null story might be 2.5
  • 3 = Story suggested, but you need to dig
  • 3.5 = Tries to pull a story from the data, but the design/details kind of hide it (so “good try”)
  • 4 = Clear story
  • 5 = (exceptional) Interesting story, well told

Bad Things

Various codes of things to point out (these are ones that come to mind - we don’t always remember to put all of them).

  • A. Difficult to see differences because of different axis scales
  • B. Unclear part/whole
  • C. Smoothing/binning issues
  • D. Not filtering error codes or null groups
  • E. Differences not necessarily big/highlighted
  • F. Very small quantities make inference unclear
  • G. Overdraw, graph compression make it hard to see things
  • H. Not planned to a page
  • I. Difficult to see differences because of design
  • J. Encoding error (line chart across categories)
  • K. Use of mysterious codes on axes
  • L. Different group sizes make the comparison hard (since it's absolute counts, not percentages)

Good Things

Various codes of things to point out (these are ones that come to mind - we don’t always remember to put all of them).

  • Z. Choice to show range/variance
  • Y. Correct chart choice

Assignment Grading

This was done after the individual vis passes. It is more holistic (looking over the set of 5 visualizations and the writeup).

The overall Score

-2 (well below expectations) to 2 (well above expectations, worthy of an A).

Generally things went into broad buckets.

Small differences from a whole-number score (e.g., 0.9 or 1.1) mean the assignment would be in the bucket of that score (1), with a note that it's just a little off from the rest of the bucket in one direction or the other.
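As a purely illustrative sketch of how to read these scores (the function and rounding rule below are just one way to interpret them, not how grades were computed):

```python
def read_overall_score(score: float) -> tuple[int, str]:
    """Illustrative only: interpret a fractional overall score as
    (bucket, note). The bucket is the nearest whole score; the fractional
    part just notes which side of that bucket the work sits on."""
    bucket = round(score)          # 0.9 and 1.1 both round to bucket 1
    if score > bucket:
        note = "a little above the rest of the bucket"
    elif score < bucket:
        note = "a little below the rest of the bucket"
    else:
        note = "typical of the bucket"
    return bucket, note

# For example, 0.9 and 1.1 both land in bucket 1, just off in different directions.
print(read_overall_score(0.9))   # (1, 'a little below the rest of the bucket')
print(read_overall_score(1.1))   # (1, 'a little above the rest of the bucket')
```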

The overall score (which is the grade for the assignment) cannot be computed from the details. It is strongly correlated with the averages over the details, but considers everything (the note is probably the most important of the details).

Diversity of designs

  • -2 = all the same simple charts
  • -1 = all the same non-simple charts
  • 0 = limited diversity (e.g., all similar category)
  • 1 = used multiple chart types
  • 2 = effectively tried a variety of things

Stories / Design / Details

These three are not distinct, and are quite subjective. They parallel the per-chart scores, but consider the whole set and may consider the writeup more. Scores may be skewed toward a visualization that is particularly good (good things are over-weighted).

Story: This tried to assess “does a story emerge from the different visualizations?”

  • 1 = there are stories, not always clear

Design: Do the chart type and other major design choices fit the story and lead to effective telling?

Details: Are the details (titles, captions, color choices, axis labels, callouts, …) good?

  • -2 = lacking in major ways
  • -1 = missing some important things
  • 0 = generally has everything
  • 1 = good choices
  • 2 = exceptional choices, details contribute to story telling and overall impression

These three get at the overall quality of the visualizations.

Writeup

An overall assessment of the writeup. It can contribute to the assessment of the visualizations. But for this score, we check whether you articulate a question / story and a rationale for your choices.

2 = explains stories, gives rationales for designs. Note: sometimes “has a rationale” gets credit even if the reasoning is suspect - this can be “the student tried” credit.

Alternate Design

  • -2 = no alternate design
  • -1 = not clear that it is a different design for the same thing (or sufficiently different, or really compared)
  • 0 = reasonably related
  • 1 = two approaches, with comparison
  • 2 = exceptional reasoning about possibilities

Class Question

  • -2 = did not note a class question
  • -1.5 = noted multiple class questions, but none of them actually answered the specified question
  • -1 = clearly does not address the question
  • 0 = at least in the spirit of the question
  • 1 = addresses the question
  • 2 = provides a good answer to the question

The questions were:

  1. Balance of samples
  2. Distribution of time in pandemic
  3. Distribution change with income
  4. What do people do as they stop working

Some notes:

  • Looking at a specific change for #2 or #4 usually gives a good but not great answer. The idea was to identify what changed, not zoom in on a particular change.
  • A basic stacked bar used to “data dump” for questions 2-4 is OK, but doesn’t really highlight the story.