Anchoring and Alignment:

Data Factors in Part-to-Whole Visualization

Connor Bailey and Michael Gleicher

University of Wisconsin - Madison

Part-to-Whole Visualization

  • Explore the effects of data, design, and behavior on performance in the example case of part-to-whole data relationships.
  • Pie charts and stacked bar charts with 7 parts where the viewer estimates the value of a highlighted part.
  • Focus on the effects of two perceptual mechanisms on viewers' ability to estimate values in these charts.

Anchor Values

  • Recognizable shapes could help viewers estimate values more quickly and accurately.
  • Data values at 25% and 50% are particularly salient, and we expect an anchoring effect based on these reference values.
  • We apply scale markers at 25% increments in both chart types to simplify the anchor effect and make the chart types comparable.

Alignment Positions

  • The start or end of a shape aligning with a scale marker may similarly help viewers estimate values more quickly or accurately.
  • We consider the case of alignment where either the beginning or end of a part aligns with the 0%, 25%, 50%, 75%, or 100% marker on the chart.
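The alignment condition above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function name `is_aligned`, the encoding of a part as a start position plus a size (both in percent), and the tolerance parameter are all assumptions made for the example.

```python
# Scale markers at 25% increments, as described in the text.
MARKERS = (0, 25, 50, 75, 100)


def is_aligned(start, value, markers=MARKERS, tol=1e-9):
    """Return True if the part's start or end edge sits on a scale marker.

    start: cumulative percentage where the part begins (hypothetical encoding)
    value: size of the part in percent
    tol:   numerical tolerance for the comparison (an assumption of this sketch)
    """
    end = start + value
    return any(abs(start - m) <= tol or abs(end - m) <= tol for m in markers)


print(is_aligned(25, 13))  # start edge on the 25% marker
print(is_aligned(37, 13))  # end edge on the 50% marker
print(is_aligned(30, 13))  # neither edge on a marker
```

The same 13% part is aligned or not depending only on where it is placed, which is why alignment is treated as a design choice rather than a data property.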

Alignment Hypotheses

$$ \begin{aligned} \mathsf{H}^{\text{accuracy}}_{\text{chart}} &: \lvert\text{error}\rvert_{\text{aligned}} < \lvert\text{error}\rvert_{\text{near-aligned}} \\ \mathsf{H}^{\text{speed}}_{\text{chart}} &: \text{time}_{\text{aligned}} < \text{time}_{\text{near-aligned}} \end{aligned} $$

We expect estimation of aligned values to be quicker and more accurate than estimation of near-aligned values.

Experiment Design

  • We used a within-subjects, stratified randomized experimental design to measure performance across the large space of conditions.
  • 60 participants completed 96 questions stratified across conditions.
  • Pre-registration: The experiment design and analysis were pre-registered on OSF.
  • Exclusion criteria: To limit the effects of outlier response times and inattentive participants, we excluded participants whose mean absolute error fell more than two median absolute deviations from the median absolute error. Response times were clipped to two standard deviations from the median response time.
  • Measures: We measured responses at integer values and computed absolute error of responses to correct values. Response time was measured from the moment the question was displayed to the moment the response was submitted.
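The exclusion and clipping rules can be sketched as follows. This is a rough sketch under stated assumptions, not the pre-registered analysis code: all function names are hypothetical, the "median absolute error" is read as the median of per-participant mean absolute errors, and clipping is applied to a single pooled list of response times.

```python
from statistics import mean, median, stdev


def absolute_error(response, truth):
    """Absolute error between an integer response and the correct value."""
    return abs(response - truth)


def mad(values):
    """Median absolute deviation from the median."""
    m = median(values)
    return median(abs(v - m) for v in values)


def excluded_participants(errors_by_participant, k=2.0):
    """Return ids whose mean absolute error lies more than k MADs from the median.

    errors_by_participant maps a participant id to a list of absolute errors.
    """
    mae = {pid: mean(errs) for pid, errs in errors_by_participant.items()}
    center = median(mae.values())
    spread = mad(list(mae.values()))
    return {pid for pid, e in mae.items() if abs(e - center) > k * spread}


def clip_times(times, k=2.0):
    """Clip response times to within k standard deviations of the median."""
    center = median(times)
    sd = stdev(times)
    lo, hi = center - k * sd, center + k * sd
    return [min(max(t, lo), hi) for t in times]
```

Clipping keeps every trial in the analysis while bounding the influence of very long response times. Whether the bounds are computed per participant or across all trials is not stated in the text and is left open here.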

Alignment Results

We found a significant effect of alignment on both the speed and accuracy of estimating values in our results.

Anchor Results

We found a significant effect of anchor values on both the speed and accuracy of estimating values in our results.

Chart Results

We cannot confirm any significant effect of chart type on estimation performance in our results.

Interaction Results

We did not find a significant interaction between chart type and alignment or anchoring in our results, but did find a significant interaction between anchoring and alignment.

Combined Results

Rounding

The effects of rounding can be seen in the response rates of the study participants across the stimuli values.
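One simple way to summarize rounding in a set of responses is the share of responses that land on "round" numbers. The sketch below assumes that round means a multiple of 5. That choice, and the function name `rounded_share`, are illustrative assumptions rather than the measure used in the study.

```python
def rounded_share(responses, step=5):
    """Fraction of integer responses that are multiples of `step`.

    step=5 is an assumed definition of a "round" response.
    """
    if not responses:
        return 0.0
    return sum(1 for r in responses if r % step == 0) / len(responses)


print(rounded_share([10, 12, 25, 33]))  # two of the four responses are multiples of 5
```

Comparing this share against the share of stimulus values that are themselves multiples of 5 would indicate how much of the pattern comes from the viewers rather than from the data.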

Distance Effects

The anchor distance effect can be seen in the absolute error of the response values of the study participants across the stimuli values.
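Anchor distance can be computed as the gap between a stimulus value and its nearest anchor. The sketch below assumes that anchors sit at every 25% scale marker, including 0% and 100%. The text names only 25% and 50% as salient values, so the anchor set and the function name are assumptions of this example.

```python
# Assumed anchor set: every 25% scale marker, including the endpoints.
ANCHORS = (0, 25, 50, 75, 100)


def anchor_distance(value, anchors=ANCHORS):
    """Distance in percentage points from a value to its nearest anchor."""
    return min(abs(value - a) for a in anchors)


print(anchor_distance(50))  # 0, the value sits on an anchor
print(anchor_distance(27))  # 2, a near-anchor value
print(anchor_distance(37))  # 12, close to the midpoint between anchors
```

Under this anchor set the distance never exceeds 12.5 percentage points, so absolute error can be examined over a small, bounded range of distances.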

Discussion

Design Factors

  • Our results show the impact of design factors on task performance.
  • Our results support the conventional wisdom recommending alignment when possible.
    • The significance and size of the effect of alignment on estimation speed and accuracy suggest that more guidance and a stronger preference for alignment may be beneficial when designing part-to-whole charts.
    • For example, rather than only aligning the top part of a pie chart, optimize the layout to maximally align parts with scale markers.

Data Factors

  • Our results show the impact of data factors on task performance.
  • Our results show that anchor and near-anchor values have a significant effect on performance, supporting the task model where anchoring is a key mechanism.
    • While designers typically cannot and should not choose their data, the strength of these effects shows that the data values should be considered during chart design.
    • Designers should know the limits of viewers' abilities, especially for values far from anchors. For example, datasets of many small parts may be better presented by other chart types.
    • While data-dependent chart design may be difficult, automatic chart generation and recommendation tools could help account for data factors.

Limitations

Part-to-Whole Charts

  • We focus on the specific task of part-to-whole estimation in pie charts and stacked bar charts.
    • This basic task may be a step toward understanding these effects in more complex tasks involving memory or comparison
    • The data-dependent effects we observed may be similar in other chart types, possibly caused by other mechanisms
    • We identify these effects in at least one common case which could have implications in data collection/selection, and chart/experiment design.

Additional Effects

  • We do not identify the source of the anchoring effect.
    • We do not assess the importance of the implicit (shape-based) and explicit (scale-mark) anchors or how they may be influenced by scale marks.
  • We observed large individual differences between viewers, as expected, and account for them in our modeling, but do not explore their connection to the observed effects.
  • We accounted for observed rounding effects in our modeling but do not measure propensity for rounding or its interactions with other variables.

Conclusions

  • Data values (data property), part positions (design choice), and rounding (viewer behavior) are related to the anchoring and alignment mechanisms, resulting in a strong effect on estimation speed and accuracy.
  • This work serves as an example of measuring the effects of data and design factors related to particular perceptual mechanisms on task performance in one case.
  • Our approach of trying to understand data-dependent variance can lead to better understanding and guidelines for the design of data visualizations.