Critiques

Good Visualizations:

Source: Cover of Independent, U.K., Jul 21, 2006

http://www.smashingmagazine.com/2008/01/14/monday-inspiration-data-visualization-and-infographics/

Explanation and background:

The visualization is simply the cover of the U.K. magazine Independent on July 21, 2006. The day before, the U.K. and the U.S.A. had rejected a UN call for an immediate ceasefire in the Israel-Lebanon conflict, while most of the world backed the UN call. Though the fact is simple and can be described in a few words, the editors chose a more active way to deliver the news, while implying their attitude towards the fact that the U.S. and U.K. were insisting on their own way.

Deconstruction:

The title (which also acts as the caption) is a question posed to the audience. What’s new in the Middle East? Who backs an immediate ceasefire? The visualization lists the nations answering “Yes” and the nations answering “No”. Using a national flag to represent a country is a common but effective method. In this specific circumstance, the number and area of the national flags create an inevitable comparison, which implies that the decisions of the U.K. and U.S. governments are against the whole world. Now, even people without sufficient knowledge of politics and current events will know that few countries other than the U.K. and U.S. declined to back the ceasefire call, which is very likely to attract the eyes of the audience.

Critique:

This is one of the best magazine cover pages I’ve ever seen. There are two major reasons why I like this visualization.

First, you might argue that the visualization doesn’t give much information. However, despite the simple information, the sharp contrast between the left and right sides makes a very strong impression, while also influencing most of the audience emotionally. Though the visualization simply indicates some facts, I believe most people will extract the following messages from it: A. most of the world backs the ceasefire call in the Middle East; B. the U.K. and U.S. governments are against the ceasefire call. Questions will soon arise for some of the audience: why are the U.K. and U.S. against the world? Is it another “national benefit” issue, or is it another event in which they want to display their power?

Second, it’s really simple. By simple, I mean the implementation of the visualization is really easy. It’s amazing that such an easily made visualization can have such a good effect.

I haven’t found anything bad about this visualization. As a cover, it properly achieves its function: to attract an audience and to present the most important content in the magazine.

Bad visualizations:

Source: http://www.willisms.com/archives/2005/03/the_american_em.html

Explanation:

This is a picture from the WILLisms NEWS website. It served as a figure for the commentary article “The American Empire”. The author uses this figure to show the audience the military expenditures of the 16 nations with the highest military spending. (Of course, the author’s major point is that America has the largest military expenditure, which is almost equal to the total of the other 15 top nations.)

Deconstruction:

The figure is based on a pie chart that uses a different color to stand for each nation in this military expenditure comparison, except for the U.S. For the U.S. slice, the author uses the national flag on a mesh background for emphasis. On the left, there is a color legend that maps the colors in the pie chart to the nations.

Critique:

The good thing is that the figure has a proper title and indicates its source.

However, the visualization is so bad that it goes against several basic principles of information visualization. Though the information it represents is straightforward, it can hardly be understood by the audience. It is bad in the following aspects:

  1. There are so many colors, one per nation, that they cannot be identified clearly. The figure challenges a person’s cognitive limits by using 16 different colors.
  2. The use of the national flag with a meshed background to emphasize the U.S. is confusing, and this abrupt change of style makes the national flag hard for the audience to recognize.
  3. The details of the lines and color blocks of the pie chart are not rendered carefully. (See the jagged edges of the pie slices!) An ugly picture, compared to a beautiful one, is much harder for the audience to accept.

In summary, examining the visualization against the principles for visualization design in Tufte’s paper, we find that the figure does have sufficient data content and an integration of all the data and comparisons; however, there are so many comparisons that the major comparison is buried by redundant information. Based on Munzner’s four-level model, we can conclude that the problem occurs at the abstraction level and the visual encoding level, since the problem description and the algorithm are unlikely to be wrong.

A Good example:

The periodic table.

http://www.aip.org/history/curie/periodic.htm

Mendeleev's periodic table

Why do I think it’s a good visualization?

– It has been one of the most successful visualizations of all time. When it was first conceived, it exposed underlying trends which allowed scientists to design experiments and procedures to isolate most of the missing elements fairly rapidly, and it gave deep insights into the structure of atoms, allowing some startling predictions about undiscovered elements. The power of the periodic table is that it makes fundamental concepts so easy to grasp that it is used in middle schools to teach concepts that at one time were inaccessible to many of the most brilliant scientists in the world.

– It supports all kinds of queries very efficiently – as long as you know something about the properties of the element, you can go right to its entry, and get its atomic mass, electronegativity, symbol, name and other properties.

To use Munzner’s language, it works on several levels:

– It solves the right problem – the outer electron shell properties of each element are made explicit by their location, while the overall shape of the table shows patterns that govern all types of atoms. The proof is that many scientists were able to use the insights in the table to discover new elements.

– It uses the right abstraction – it is the same abstraction of electron shells that scientists came up with to describe electrons – that the number of valence electrons and filled shells can determine an element’s chemical properties.

– It uses the right visual encoding – position is first and foremost, and informative empty spaces are left in. Color can be used to encode other things, like the metal-nonmetal boundary, radioactivity, melting point, or other groupings.

– There isn’t really an algorithm to speak of…

Tufte’s principles:

– Comparisons – With the table you can compare elements, rows, columns, metals and non-metals, and the larger blocks corresponding to which subshells are outermost.

– Causality / mechanism – It’s got it in spades. Not to belabor the point, but the electron shell structure dominates most of the chemical and physical properties of elements, and it’s right there in the table.

– Multivariate analysis + Integration – Outer and inner shell electrons are encoded in position, and other properties are either shown in color, or listed directly in the table.

– Documentation – It all started with Mendeleev, but since then it’s been reworked by countless researchers, and is one of the most trusted and authoritative of all visualizations of data.

– Content – I think this principle is outside of the scope of visualization, but in any case, the content is a sound scientific theory, backed up by partial evidence in the beginning, and much more evidence since then.

A bad example

http://worldprocessor.com/catalog/world/images/011.jpg

Statistical Challenges

Why do I think it’s a bad visualization?

– It’s a globe – you can only see one side at a time

– It looks like all nonsense – with other globes on the same site, http://worldprocessor.com/catalog/world/, it at least makes sense to use a globe because they are showing data that has some geographic locality and “global” (not a pun) scope. Why would you draw these things on a globe? Where do they get these connections? What is the underlying story? What patterns are there in this?

– To be fair, this is probably not meant as a serious visualization, but as a kind of artwork. A lot of modern art seems to be trying to convey the feeling of helplessness in the face of the overwhelming complexity of modern life, but as a clear visualization of the author’s ideas, I think this one fails pretty badly.

Munzner’s principles:

– It’s not clear what problem they are solving, or if there is even an attempt to solve a problem here.

– The abstraction is a network between nebulous concepts involving various aspects of society and life in general. It’s only the least bit clear what the arrows mean – for instance, “standard of living” and “happiness” are only linked through the confounding influence of “energy production”.

– Encoding – position is not meaningful, color is not used, orientation varies without any apparent meaning, making it harder to read the text snippets, darkness (saturation) of the text varies in no discernibly meaningful way, and, being a globe, you can only see one side of it, and not without some distortion.

Tufte’s principles:

– Let’s just say Tufte would leave the room, and leave it at that.

Stinks
The National Spatial Data Infrastructure (NSDI) is a… well, what is it? Let’s find out.
Step 1. Google for NSDI. The very first result is
National Spatial Data Infrastructure — Federal Geographic Data …
Executive Order 12906 calls for the establishment of the National Spatial Data Infrastructure defined as the technologies, policies, and people necessary to …
Step 2. Go to www.fgdc.gov/nsdi/nsdi.html where we see
“Executive Order 12906 calls for the establishment of the National Spatial Data Infrastructure defined as the technologies, policies, and people necessary to promote sharing of geospatial data…”
So, NSDI is a collective noun for tech, policies and people.
Step 3. Not knowing where to go further, I click on “Home” to land at http://www.fgdc.gov/ where I see
“This nationwide data publishing effort is known as the National Spatial Data Infrastructure (NSDI).”
Hmmm… NSDI is a publishing effort. Moving on…
“The NSDI is a physical, organizational, and virtual network…”
So, now the NSDI is a magical, shape-shifting, material as well as intangible network!
Step 4. Further down, there is a link to the “Components of the NSDI” so I click on http://www.fgdc.gov/components where I see a block diagram of NSDI which shows that NSDI has 6 components.
NSDI components
For some reason, 4 components are stacked on top of each other, one wraps around two sides of the 4, and the sixth component is in the background, forming a backdrop for the other five. Why it is so, I have no idea. Is there a significance to “Standards” wrapping around the two sides of the stack of “Clearinghouse/Portal, Metadata, Framework and GEOdata”? And why does “GEOdata” have GEO in caps? Is that an acronym for something? Why is “Partnerships” italicized? What does that signify?
Oh, there is much more that is wrong with this picture, but I will stop here. As an attempt to convey what should be the premier national initiative to build capacity for geospatial data and technologies in one of the largest and certainly the richest and most advanced country in the world, the one that invented most of this field and its technologies, this whole presentation simply stinks.
Now, for some fresh air
I love walking. I firmly believe that my quality of life is very closely tied to whether or not I can walk to where I want to go. I detest and fear a life where I would be dependent on cars. So, when walkscore.com came by, I was in heaven.

Walkscore of my neighborhood

In a very simple way, in a very simple interface, with just one box and no fiddling, it clearly and easily helps me discover the walkability of a neighborhood. It has everything that would make Tufte smile… multivariate analysis, the complex made simple, content that is king, ability to compare and contrast neighborhoods, a moving picture, and a clear explanation of “how it was done.”
Of course, it also has the best use of today’s technology. Just imagine what Minard’s march of Napoleon would have looked like if he had Google Maps, jQuery and Processing!

I think it is a common misconception that visualization needs to be complicated to convey a number of variables, and the two examples that I have chosen address this issue.  The first example is a UV index map from the EPA that is updated regularly on their website.

The bad:
There are a few features with the EPA’s UV index forecast that are worth pointing out.  First, the color scheme is very misleading.  We customarily associate red with danger and ‘cooler’ colors with safety.  This map has the red hues squarely in the middle of the scale with blue tones encapsulating both ends of the spectrum.  In fact, the two colors assigned to the lower and upper end of the scale are very similar and could almost be used interchangeably.   There is no logical progression to the colors, and the map author would have been better off using saturation (with one hue) in order to show increased UV levels.  (The full sized version can be seen here.)
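The single-hue suggestion above can be sketched in a few lines. This is only an illustration, not how the EPA map is produced: the function name `uv_index_color`, the choice of a red hue, and the assumed 0 to 11 scale are all assumptions made for the example.

```python
import colorsys


def uv_index_color(uv_index, max_index=11, hue=0.0):
    """Map a UV index to a hex color with a fixed hue and rising saturation.

    Assumptions for this sketch: the scale runs from 0 to max_index,
    and hue=0.0 (red) is an arbitrary choice of a single hue.
    """
    # Clamp to the assumed scale so out-of-range values still get a color.
    clamped = max(0, min(uv_index, max_index))
    # Saturation grows linearly with the index; hue and brightness stay fixed.
    saturation = clamped / max_index
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


if __name__ == "__main__":
    # Print the whole ramp, from no saturation (white) to full saturation.
    for level in range(12):
        print(level, uv_index_color(level))
```

With one hue, a reader only has to judge "more color means more UV", so the two ends of the scale can no longer be confused with each other.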

The good:
This visualization shows the additional Crayon colors that have been created since 1903.  Color is used to represent actual color, which could have been accomplished less dynamically in a number of different ways.  The timeline emphasizes the number of additions within the last 20 years, while showing the smaller proportion of changes before 1958.  A table showing the same information would not show the diversity as effectively. (The full sized version can be seen here.)

And now for one that I do like:

This one comes from a presentation by Bilge Mutlu, an assistant professor in our own CS department.

Yes, OK, this figure appears to do the same trick of indicating longer gaze fixations with bigger dots. I’m not a big fan of that. However! That’s a sideshow to the larger story: the role of gaze position in conversation.

The large timeline at the bottom shows both where the main speaker is looking over time, as well as when each person is speaking.

In general, this graphic isn’t anything flashy or using super sweet edge clustering; however, it does an excellent job of showing many simultaneous, related events. My main quibbles are with the patterned area in the speech (what does that mean?), and that even with the time scale at the top, it’s hard to get an intuitive sense of how much language fits into this figure. If space permitted, it might be interesting to actually transcribe the spoken text into the speech bars.

Bad Visualization Example: Morning Rush

Finding a bad visualization wasn’t really a problem, as there are plenty of them. I picked this one from Time magazine. The screenshot of the visualization is given below; it tries to show the average commute time for major U.S. cities. One thing not shown in the screenshot is the hover menu that pops up as soon as you move the mouse over any of the bars. It shows the average delay in numbers.

Critique:

From Tamara Munzner’s guidelines:

(i) It uses color saturation to show quantitative data, i.e. average delay time. The interpretation of saturation is shown at the bottom right corner.

(ii) This is an extremely bad use of 3D representation. The representation suffers from both occlusion and perspective distortion. The average delay values of many of the cities are hidden behind one another, while the height of the bars doesn’t help with any sort of comparison.

From Tufte’s guidelines:

(i) They didn’t mention the source of the data, and it is not clear from the context either.

(ii) Tufte also stresses that even if a visualization is published by an organization, the names of the individuals who designed it should be mentioned. In the context of a research paper, I guess this is implicit, but here it is not.

Overall, this visualization doesn’t provide any information other than acting as a graphical look-up (table) for average delay values across major cities. Even if the sole purpose was to offer a nice presentation, I think Time could have done a better job than this.

Good Visualization Example: LiveRAC

This one is from one of Munzner’s PhD students, and the work is also related to the AT&T vis research group.  The link to the paper is here.

Critique:

There are generally a large number of system monitoring parameters that are of interest to system administrators. Often these parameters don’t form a pattern or correlate with each other. For example, a large number of incoming network packets is not necessarily related to the number of processes present in the system, or even in a cluster of systems, during a given time period. But sometimes seeing them together and across a cluster can help quickly detect and fix problems. The sheer volume of system monitoring information can also pose a serious problem. So, I think the above visualization does a pretty good job and is an effective use of time-series plots.

Each row shows one physical device, with columns showing time-series plots for individual parameters. Each column can be sorted, and an interval can be highlighted across all columns. One of their design principles was ‘overview first and then zoom’. Following this, each row can be expanded to see the full plots. They have also used color encoding to show the threshold of values in each row. For the rows (or devices) that are not fully visible, the color indicates the magnitude of the value for the given parameter, ranging from higher (red) to lower (gray). This allows us to easily zoom in to the set of nodes that are showing higher values for a given parameter and then zoom in further to see the full details (cool!).

# Network Performance Visualization


Sometimes, with visualization, we can enhance the quality of a presentation, and by doing so we add value to the product’s reliability and its acceptance by people.  Akamai is one of the leaders in networking technologies, and they show their network performance with simple visualization tools. Here is one snapshot. With this tool, we can intuitively find out the performance of the network for large data transfers.

#2. Most of the time, effective visualization makes complex physics easy to understand.

Scientists sometimes have a social obligation to make science popular, and for humans the easiest way to grasp certain concepts is with visual tools. I have picked some nice illustrations from Scientific American that show how beautiful pictures can summarize human knowledge in a few images.

In common parlance we sometimes use terms such as “black holes” and “expanding universe”, while scientists spend their entire lives trying to understand these deep mysteries of nature.  With visualization tools, we can capture the imagination of young people and attract them toward science.  All great scientists (for example, Richard Feynman) were great teachers because they were able to present their ideas with simplified pictures or models.

Bad example:

Spatial Images of Microarray Data

It takes a microarray data structure and creates a pseudocolor image of the data arranged in the same order as the spots on the array. Therefore, this plot shows the spatial distribution of the microarray.

In my opinion, this plot has drawbacks in three aspects.

First, it fails to emphasize the hot regions and spots. The function of the spatial image should be not only to present the microarray chip, but also to point out the regions that differ from their neighbors, by listing their location or intensity.

Second, it does not give coordinate information. From the image, we can only guess the position of points of interest.

Third, it might be better to enlarge the spatial plot and make the scale bar thinner, as well as to make use of the space between them.

Good example:

Heat map of Microarray Data (by NimbleGen)

Biology heat maps are typically used in molecular biology to represent the level of expression of many genes across a number of comparable samples as they are obtained from DNA microarrays.

This example compares four groups. Here are some good parts, in my view:

  1. Figure A clearly shows the relationship of the four groups, which are a control, two treatments, and their combination.
  2. Figure A suggests how to group the DNA segments (rows), which takes advantage of a strength of the heat map.
  3. Figure B compares the levels of the different groups. It shows the trend and intensity, although the latter is hard to see.
  4. Figure C shows their expression levels. I believe that if something is important but hard to show in the original plot, it is worth adding another plot.

My examples are both (at least partly) concerned with where people look. First, my bad example:

This image, taken from a brain imaging paper, illustrates where one subject looked over time while being shown several emotionally arousing images. The green shapes indicate “areas of interest” used in statistical analyses, the red circles show where the subject’s gaze lingered (larger circles mean a longer fixation), and the yellow line connects the red circles, roughly indicating the gaze path. The solid red circle is for calibration purposes; it is the size of a one-second gaze fixation.

Here’s a larger version of one frame:

A couple of things in this figure irk me. First, two very different types of information (is this shape an area of interest or a gaze fixation?) are distinguished solely by color. In addition, while gaze position being encoded by position is completely reasonable, gaze duration is encoded by size — I suspect it’s radius, but it’s impossible to tell.

The problem with using size to encode duration is that it implies size or inclusion. All of the gaze fixation points are fundamentally the same size. A better option might be to draw each fixation the same size, and indicate duration with color intensity.
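To see why it matters whether duration is mapped to radius or to area, here is a small sketch. The function names and the 10-pixel radius for a one-second fixation are assumptions made for illustration; the paper’s actual scaling is not known.

```python
import math


def radius_from_duration_linear(duration_s, unit_radius=10.0):
    """Radius proportional to duration (hypothetical encoding A)."""
    return unit_radius * duration_s


def radius_from_duration_area(duration_s, unit_radius=10.0):
    """Area proportional to duration (hypothetical encoding B)."""
    return unit_radius * math.sqrt(duration_s)


def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


if __name__ == "__main__":
    # Compare how much ink a 2-second fixation gets relative to a 1-second one.
    for name, encode in [("radius-proportional", radius_from_duration_linear),
                         ("area-proportional", radius_from_duration_area)]:
        ratio = circle_area(encode(2.0)) / circle_area(encode(1.0))
        print(f"{name}: a 2 s fixation covers {ratio:.1f}x the area of a 1 s one")
```

Under the radius-proportional encoding a fixation twice as long covers four times the area, so a reader who judges by area will overestimate long fixations. Without knowing which rule the figure uses, the circles cannot be read quantitatively.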

I don’t mind the yellow line particularly; however, one can tell neither the direction of time along that line nor how far along in time any one fixation is.

Good

From Time magazine, February 1, 2010, Vol. 175, No. 4.

I like that it uses real product images in the plot. That attracts more attention than a bar with the company name below it, and it also lets us know which company makes which product. But this visualization could be improved to show that the merger of Cadbury and Kraft will have a larger market share than Mars has. Yes, this is a big problem, because that point is what the writer wants to make. Still, I liked the use of real product images a lot. Maybe, as a consumer, I interpret the article with the help of the plot as “a company selling Milka buys a company selling Dairy Milk and will be bigger than a company selling Mars”, instead of “Kraft buys Cadbury and will be bigger than Mars Inc.” For investors, though, the company names would matter more than the product names. If I were to improve the plot, I would put the Kraft image in a dotted rectangle above the Cadbury image and draw an arrow between the Kraft image in the fifth column and the Kraft image in the dotted rectangle. This would show that the height of Cadbury + Kraft is greater than that of Mars.

Bad

From Time magazine, February 1, 2010, Vol. 175, No. 4.

The bad aspects of this visualization are as follows.

  1. It would be better if the proxies were ordered by decreasing minutes.
  2. The proxy for text messaging is awkward.
  3. The paragraph above it compares present media consumption to past media consumption, so there should be a visualization of past media consumption parallel to the current one.
  4. Where is reading? Maybe reading is counted as leisure rather than as a media activity.

But, there are some good aspects to this visualization, too.

  1. Showing the time as blocks easily gives us an idea of the proportion of each activity in overall media consumption.
  2. Even though the proxy for messaging is awkward, the other proxies, with the same color encoding as their blocks, are good, and the use of proxies conveys the concepts instantly.
  3. Putting in the actual minutes is good, even though it is a little redundant with the block counts. The reason is that users do not have to count the blocks when they want to compare their own media consumption with the average, which I did.

I chose this visualization as bad because the third bad aspect concerns what I guess the writer wants to tell, and because the first bad aspect could be fixed easily and bothers me a lot.