Assignment 4

Bad Visualization Example: Morning Rush

Finding a bad visualization wasn’t really a problem, as there are plenty of them. I picked this one from Time magazine. The screenshot of the visualization is given below; it tries to show the average commute time for major U.S. cities. One thing not shown in the screenshot is the hover menu that pops up as soon as you move the mouse over any of the bars. It shows the average delay as a number.

Critique:

From Tamara Munzner’s guidelines:

(i) It uses color saturation to show quantitative data, i.e. average delay time. The legend for interpreting saturation is in the bottom right corner.

(ii) This is an extremely bad use of 3D representation. It suffers from both occlusion and perspective distortion: the average delay values of many cities are hidden behind one another, and the height of the bars doesn’t help with any sort of comparison.

From Tufte’s guidelines:

(i) They didn’t mention the source of the data, and it is not clear from the context either.

(ii) Tufte also stresses that even if a visualization is published by an organization, the names of the individuals who designed it should be mentioned. In the context of a research paper I guess authorship is implicit, but here it is not.

Overall, this visualization doesn’t provide any information other than acting as a graphical look-up table for average delay values across major cities. Even if the sole purpose was to offer a nice presentation, I think Time could have done a better job than this.

Good Visualization Example: LiveRAC

This one is from one of Munzner’s PhD students, and the work is also connected to the AT&T vis research group. The link to the paper is here.

Critique:

There are generally a large number of system monitoring parameters that are of interest to system administrators. Often these parameters don’t show any pattern or correlation with each other. For example, a large number of incoming network packets is not necessarily related to the number of processes present in a system, or even a cluster of systems, during a given time period. But sometimes seeing those together and across a cluster might help to quickly detect and fix problems. The sheer volume of system monitoring information can also pose a serious problem. So, I think the above visualization does a pretty good job and is an effective use of time-series plots.

Each row shows one physical device, with columns showing time-series plots for individual parameters. Each column can be sorted, and an interval can be highlighted across all columns. One of their design principles was ‘overview first and then zoom’. Following this, each row can be expanded to see the full plots. They have also used color encoding to show value thresholds in each row. For rows (or devices) that are not fully visible, the color indicates the magnitude of the given parameter, ranging from higher (red) to lower (gray). This allows us to easily zoom in on the set of nodes showing higher values for a given parameter and then zoom in further to see the full details (cool!).

# Network Performance Visualization


Sometimes, with visualization, we can enhance the quality of a presentation, and by doing so we add value to the product’s reliability and its acceptance by people. Akamai is one of the leaders in networking technologies, and they show their network performance with simple visualization tools. Here is one snapshot. With this tool, we can intuitively find out the performance of the network for large data transfers.

#2.   Most of the time, effective visualization makes complex physics easy to understand.

Scientists sometimes have a social obligation to make science popular, and for humans the easiest way to grasp certain concepts is with visual tools. I have picked some nice illustrations from Scientific American that show how a few beautiful pictures can summarize human knowledge.

In common parlance we sometimes use terms such as “black holes” and “expanding universe”, for which scientists spend their entire lives trying to understand the deep mysteries of nature. With some visualization tools, we can capture the imagination of young people and attract them toward science. All great scientists (for example, Richard Feynman) were great teachers because they were able to present their ideas with simplified pictures or models.

Bad example:

Spatial Images of Microarray Data

It takes a microarray data structure and creates a pseudocolor image of the data arranged in the same order as the spots on the array. Therefore, this plot shows the spatial distribution of the microarray.

In my opinion, this plot has drawbacks in three aspects.

First, it fails to emphasize the hot regions and spots. The function of the spatial image should not only be to present the microarray chip, but also to point out the regions that differ from their neighbors, by listing their location or intensity.

Second, it does not give the coordinate information. From the image, we can just guess the position of points of interest.

Third, it might be better to enlarge the spatial plot and make the scale bar thinner, as well as make use of the space between them.

Good example:

Heat map of Microarray Data (by NimbleGen)

In molecular biology, heat maps are typically used to represent the expression levels of many genes across a number of comparable samples, as obtained from DNA microarrays.

This example compares four groups. Here are some good parts in my viewpoint:

  1. Figure A clearly shows the relationship among the four groups, which are the control, two treatments and their combination.
  2. Figure A suggests how to group the DNA segments (rows), which takes advantage of a strength of the heat map.
  3. Figure B compares the levels of the different groups. It shows the trend and intensity, although the latter is hard to see.
  4. Figure C shows their expression levels. I believe that if something is important but hard to show in the original plot, it is worth adding another plot.

My examples are both (at least partly) concerned with where people look. First, my bad example:

This image, taken from a brain imaging paper, illustrates where one subject looked over time while being shown several emotionally arousing images. The green shapes indicate “areas of interest” used in statistical analyses, the red circles show where the subject’s gaze lingered (larger circles mean a longer fixation), and the yellow line connects the red circles, roughly indicating the gaze path. The solid red circle is for calibration purposes; it is the size of a one-second gaze fixation.

Here’s a larger version of one frame:

A couple of things in this figure irk me. First, two very different types of information (is this shape an area of interest or a gaze fixation?) are distinguished solely by color. In addition, while gaze position being encoded by position is completely reasonable, gaze duration is encoded by size — I suspect it’s radius, but it’s impossible to tell.

The problem with using size to encode duration is that it implies size or inclusion. All of the gaze fixation points are fundamentally the same size. A better option might be to draw each fixation the same size, and indicate duration with color intensity.
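To make the radius-versus-area ambiguity concrete, here is a small sketch of the two ways a fixation circle could be scaled against the one-second calibration circle. The function names and the reference radius of 10 are my own assumptions for illustration; nothing here is taken from the paper’s actual plotting code.

```python
import math


def fixation_radius(duration_s, reference_radius=10.0, scale_by="area"):
    """Radius of a fixation circle relative to a one-second reference circle.

    reference_radius is a hypothetical size for the calibration circle.
    scale_by="radius": the radius grows linearly with duration.
    scale_by="area":   the area grows linearly with duration.
    """
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    if scale_by == "radius":
        return reference_radius * duration_s
    if scale_by == "area":
        return reference_radius * math.sqrt(duration_s)
    raise ValueError("scale_by must be 'radius' or 'area'")


def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


if __name__ == "__main__":
    one_second_area = circle_area(fixation_radius(1.0))
    for duration in (1.0, 2.0, 4.0):
        for mode in ("radius", "area"):
            r = fixation_radius(duration, scale_by=mode)
            ratio = circle_area(r) / one_second_area
            print(f"{duration:.0f}s, scaled by {mode}: radius={r:.1f}, "
                  f"area is {ratio:.1f}x the one-second circle")
```

Under radius scaling, a four-second fixation covers sixteen times the area of the calibration circle; under area scaling it covers four times. A reader who cannot tell which rule was used can be off by a large factor when estimating durations.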

I don’t particularly mind the yellow line; however, one can tell neither the direction of time along that line nor how far along in time any one fixation is.

Good

From Time magazine, February 1, 2010, Vol. 175, No. 4.

I like that it uses real product images in the plot. It attracts more attention than a bar with the company name below it, and it also lets us know which company makes which product. However, this visualization could be improved to show that the merger of Cadbury and Kraft will have a larger market share than Mars has. Yes, this is a big problem, because that point is what the writer wants to tell. Still, somehow I liked the use of real product images a lot. Maybe, as a consumer, I interpret the article with the help of the plot as “a company selling Milka buys a company selling Dairy Milk and will be bigger than a company selling Mars”, instead of “Kraft buys Cadbury and will be bigger than Mars Inc.” For investors, though, the company names would matter more than the product names. If I were to improve the plot, I would put the Kraft image in a dotted rectangle above the Cadbury image and draw an arrow from the Kraft image in the fifth column to the Kraft image in the dotted rectangle. This would make the combined height of Cadbury + Kraft higher than that of Mars.

Bad

From Time magazine, February 1, 2010, Vol. 175, No. 4.

The bad aspects of this visualization are as follows.

  1. It would be better if the proxies were ordered by decreasing minutes.
  2. The proxy for text messaging is awkward.
  3. The paragraph above the graphic compares present media consumption to past media consumption, so it should show a visualization of past media consumption parallel to the current one.
  4. Where is reading? Maybe reading is counted under leisure rather than media activities.

But, there are some good aspects to this visualization, too.

  1. Showing the time as blocks makes it easy to see the proportion of each activity in overall media consumption.
  2. Even though the proxy for messaging is awkward, the other proxies, with the same color encoding as their blocks, are good, and the use of proxies conveys the concepts instantly.
  3. Printing the actual minutes is good even though it is a little redundant with the block counts. The reason is that users do not have to count the blocks when they want to compare their own media consumption with the average, which I did.

I chose this visualization as bad because the third bad aspect concerns what, I guess, the writer wants to tell, and because the first bad aspect could be fixed easily and bothers me a lot.

GOOD example:

Newsmap

Newsmap is a treemap visualization tool that depicts which news topics are being most intensely covered at a given moment. The goal of Newsmap is to help news readers find underlying patterns in news issue selection. The data is derived from Google News, which has location, section and time metadata and, moreover, already clusters similar articles into chunks with its own semantic algorithms. It is intended for use by regular web users rather than trained specialists.

The quantity of a given issue topic is encoded into the size of the box and the font of the title inside it. The section in which the article cluster can be found is encoded into color hue, while the age of the issue is encoded into color lightness. The tool provides browsing by enabling users to select specific sections and countries. It also provides a search window that filters out article clusters with a specific keyword.
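As a rough illustration of this kind of encoding, the sketch below maps one article cluster to a share of the treemap area and to an RGB color. This is not Newsmap’s actual code: the section hues, the 24-hour cutoff, the lightness range and the choice that newer stories are lighter are all assumptions of mine.

```python
import colorsys

# Hypothetical hue per news section, on the 0..1 hue circle used by colorsys.
SECTION_HUES = {"world": 0.0, "sports": 0.33, "business": 0.6}


def encode_cluster(article_count, total_articles, section, age_hours,
                   max_age_hours=24.0):
    """Return (area_share, (r, g, b)) for one article cluster.

    Size encodes quantity: the box gets the cluster's share of all articles.
    Hue encodes the section; lightness encodes age (assumed: newer is lighter).
    """
    area_share = article_count / total_articles
    hue = SECTION_HUES[section]
    # freshness is 1.0 for a brand-new story and 0.0 at or beyond the cutoff.
    freshness = 1.0 - min(age_hours, max_age_hours) / max_age_hours
    lightness = 0.25 + 0.35 * freshness
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 0.8)
    return area_share, (round(r * 255), round(g * 255), round(b * 255))


if __name__ == "__main__":
    print(encode_cluster(30, 120, "world", age_hours=1.0))
    print(encode_cluster(30, 120, "world", age_hours=20.0))
```

Two clusters from the same section keep the same hue, so a reader can still tell the section apart from the age, which is what makes the hue/lightness split work.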

I like this visualization because it clicks with the intended audience by keeping its presentation as intuitive as possible. The topics of interest are right there in your face, since the design visually guides the focus to the more prominent features. Moreover, it does not overload you with too much metadata, because you can’t read the topics in the smaller boxes and fonts unless you intentionally hover your mouse over them. With remarkable simplicity, it shows the current landscape of topics in terms of specific issue, theme section and age of news on a single page.

======================

BAD Example:

ArtDiaspora.viz

ArtDiaspora.viz is an art-meets-cultural-studies tool to visualize the concept of diaspora by depicting various relationships between Korean-born artists, their current residency and their works. It was presented at the 2002 Kwangju Biennale, and uses the data from the artworks exhibited there.

This visualization ambitiously tries to include a vast array of data in a circular network diagram. The different classes (title of artwork, artist, country of residence, place of birth, year of birth, year of artwork, form of artwork) are encoded into different colors. Then the nodes are connected with colored edges.

Given that this was presented at an art exhibition, the target audience may have been impressed by its beautiful design and the very idea of visualizing this kind of data. However, I don’t think it achieves its core goal of conceptualizing diaspora, because it is hard to read patterns with too many data categories crammed into one diagram. The artist’s name, country of residency and year of work would have been enough to show a meaningful pattern of talented artists moving out of Korea. Even for that, I don’t really see the need to use a circular network diagram rather than simple histograms and time-series world maps. On a smaller note, I don’t see why the category title (e.g. “Date of birth”) connects an edge to every single entity within the category. It makes the overloaded picture even messier.

Bad example

The two graphs show the natural rate of population increase for both developed countries and developing countries. The natural rate of population increase is the difference between the birth rate and the death rate. The purpose of this visualization is to reveal the difference in population growth rate (natural rate of population increase) between developed countries and developing countries.

However, this picture is an example of bad visualization because it doesn’t encode the important information in an effective way:

1) To get the natural rate of increase at a particular point in time, the viewer needs to calculate the vertical difference between the birth rate and the death rate at that time, which is not intuitive.

2) The author wished to compare the population growth rates and conclude from these graphs that the population growth rate of developed countries is low and stable, while that of developing countries is high. However, because the two graphs are drawn separately, it’s not easy to compare them.
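Both problems go away if the difference is computed once and the two derived series are put side by side. Here is a minimal sketch of that idea; the numbers are made up purely for illustration (rates per 1,000 people) and are not read off the graphs.

```python
def natural_increase(birth_rates, death_rates):
    """Natural rate of increase: birth rate minus death rate, per time point."""
    if len(birth_rates) != len(death_rates):
        raise ValueError("birth and death series must have the same length")
    return [b - d for b, d in zip(birth_rates, death_rates)]


if __name__ == "__main__":
    # Made-up example values per 1,000 people, NOT taken from the original figure.
    years = [1950, 1975, 2000]
    developed = natural_increase([22, 16, 11], [10, 9, 10])
    developing = natural_increase([44, 38, 25], [24, 14, 9])

    # One table (or one chart with two lines) lets the viewer compare directly.
    print("year  developed  developing")
    for year, low, high in zip(years, developed, developing):
        print(f"{year}  {low:9d}  {high:10d}")
```

Plotting these two derived series on a single set of axes would show the gap between the groups at a glance, without asking the viewer to subtract curves by eye across two separate graphs.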

Good example

http://well-formed.eigenfactor.org/radial.html#/?id=4324

This visualization gives an overview of the whole citation network. The colors represent the four main groups of journals, which are further subdivided into fields in the outer ring. The segments of the inner ring represent the individual journals. In the initial view, the top 1000 citation links are plotted. Selecting a single journal (inner ring) or whole field (outer ring) displays all citation flow coming in or out of the selection.

Benefit:

(1) The visualization is dynamic. If the viewer clicks a journal, the other journals and their citation links fade away, and the clicked one is highlighted.

(2) The ring design gives the same emphasis to all journals. Imagine a line or another shape: the information at the corners might be ignored.

(3) The lines (citation links) between different journals run very smoothly, which gives viewers a clear view of which journals are related.


Visualizing 3-Dimensional line data is an extremely tricky problem. Examples of such data are neuron connectivity information and fluid flow simulations. Below is a visualization of DTI fiber tracts in a human brain.

DTI fiber tracks

This visualization is able to correctly convey the connectivity information and at the same time adds depth cues in order to keep the three-dimensional context. The paths are also bundled to give emphasis to general trends. I also like the fact that they use implicit perceptual cues (the width of the halos around each line) and allow our visual system to perform the hard work of deciding what is in front of what. Previous approaches have tried to use realistic rendering, but because we rarely see such structures in real life, we have a hard time extracting the structure.

Here’s a link to the project page:

http://www.cs.rug.nl/~isenberg/VideosAndDemos/Everts2009DDH

The paper :

http://www.cs.rug.nl/~isenberg/personal/papers/Everts_2009_DDH.pdf

And additional high res images:

http://www.cs.rug.nl/~isenberg/uploads/VideosAndDemos/Everts_2009_DDH_supplemental_normal.pdf

There are also a couple of links to videos on the project page. I encourage everyone to take a look at them.

I think that one of the most common problems when trying to visualize data is overcrowding. This can happen whenever one tries to display too many data categories, too many data points, or both. Below is a map from the National Weather Service that displays the current national weather warnings and advisories.

Small map of weather warnings

The main problem with this map is that it tries to cram way too many types of warnings and advisories into a single view. There are a total of 30 categories, and all are coded by color. Eventually it becomes almost impossible to distinguish between them. For example, does the coast of Hawaii have a Coastal Flood Warning, a High Surf Warning, or both? Further investigation reveals that it was a High Surf Warning. There are even more problems when one considers the interaction a user might have with the map. For one, clicking on a state and using the drop-down menu to select the same state produce completely different results.

The current map and others can be found here

http://www.weather.gov/

Many of you have already figured it out, but for Assignment 4, we ask that you make postings (not comments). Preferably with the picture of the visualization you are critiquing (but at least a link to it). If you look in the Student Posts category you’ll see examples from people who have figured it out without this hint.