Comments on: Reading 4: Evaluation https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation Course web for CS838 Spring 2010, Visualization Wed, 22 Feb 2012 22:15:57 +0000 By: watkins https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-117 Wed, 10 Feb 2010 17:20:52 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-117 In reply to punkish.

I don’t see the irony here. I might have seen some irony while reading the “Cherry-Picking, Evidence Selection, Culled Data” section, where Tufte implies that enthusiasm for one’s work is related to poorly designed studies, but he’s trying to make a different point in the “Punning” section. The author of the excerpt from “Painting Outside the Lines” either intentionally or carelessly uses a word that can be interpreted multiple ways, then uses the meaning that is most convenient for making his point. So it’s not necessarily about the tone of the writing, just the vocabulary.

Also, in the interest of not cherry-picking, it’s probably worth pointing out that North’s article had a typo too. (“They place an undo burden on the evaluation designers…”)

By: Nate https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-116 Tue, 09 Feb 2010 16:11:15 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-116 In reply to punkish.

Yup. And it’s pretty brazen of him to complain, in his opening paragraph of ‘Corruption’, of rhetorical ploys in presentation.

By: Nakho Kim https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-115 Tue, 09 Feb 2010 15:00:43 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-115 In reply to Adrian Mayorga.

I strongly agree with the point that all three roles should be taken, though I interpreted them in more straightforward terms – evaluate your visualization by examining your visualization process systematically (Munzner), check whether it tells the right story (Tufte), and evaluate how it fared with the users (North).

By: faisal https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-114 Tue, 09 Feb 2010 13:36:23 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-114 In reply to hinrichs.

The use of qualitative methods often makes us uneasy. But I guess some of the coding methods are fairly rigorous for developing deep insights/theories from raw observations such as think-aloud or similar ethnographic data. In this context I think coding is being used in a very limited sense: whether an insight by a user is correct or not, as judged by domain experts.

I am not sure we can say a visualization tool is not a user interface (or a human–computer interface). I think these tools have similar usability concerns, and concerns for human expectations and assigned value, that make them as much an interface as anything else we might use for interacting with computers.

By: Jeremy White https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-113 Tue, 09 Feb 2010 13:01:13 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-113 There were three vastly different approaches to visualization this week, but which one of these approaches will result in the best visualization?

Tufte would have you believe that proper integration and data density will efficiently drive the message. Essentially, he stresses telling the story by providing the answers.

Munzner provides a framework through which a systematic evaluation of tasks and data will establish direction for design. Nesting the processes ensures that procedural order is covered in its entirety.

North posits that domain expertise will guide the visualization through insight. He would assert that the best person to tell a particular story is someone who has experience or vast knowledge of the subject matter.

If it were obvious which approach was the most effective, I suppose this seminar could be reduced to an afternoon workshop. One thing remains clear, however: visualization requires an understanding of the underlying message and the target audience.

By: faisal https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-112 Tue, 09 Feb 2010 12:57:55 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-112 The insight characterization used by North is a very useful idea that could actually serve as a validation method alongside the formal lab experiment in Munzner’s nested model. Munzner seems to focus more on informal styles of usability studies and highlights a significant problem with lab tests: their failure to capture task mismatch, given that participants will be doing tasks as designed by the experimenter. North’s work addresses this problem by measuring insight on tasks accomplished by domain scientists with a given tool, and can address some of the concerns regarding lab experiments.

I read the North journal paper stating their experimental methodology. The rigorous experimental methodology adopted for visualization evaluation can be very valuable, given that one has enough resources to do so. It might not be necessary for the toolsmith to actually compare many tools against given datasets (thus reducing the number of participants).

For Tufte’s readings, I read the first chapter stating different visualization principles using Minard’s map example but skimmed through the second one. Unlike Tufte’s other readings in this course so far, I found this one to have more rhetorical detail than actual message he wanted to convey.

By: Jim Hill https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-111 Tue, 09 Feb 2010 11:26:25 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-111 In reply to lyalex.

I didn’t notice that there was no visual pertaining to the direction of the army. However, I didn’t have any issues determining what the direction was. Could it possibly be that there was enough context to imply the direction?

By: hinrichs https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-110 Tue, 09 Feb 2010 09:14:08 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-110 Tufte – on Evidence corruption. One of my all-time favorite quotes is, “the best way to tell a lie is to tell the truth so unconvincingly that no one will believe it.” Tufte hit on that early on, but not in so many words.

It’s interesting that this chapter focused mainly on obfuscating language – double meanings, over-reaching, hiding the evidence by summarizing, and every shade of confirmation bias. While I am whole-heartedly on board that language is often ripe for abuse and one should always demand precision wherever possible, I wonder how many of these ideas translate directly into visualization topics.

Double meanings – Do arrows, lines, colored shapes or scattered dots offer multiple interpretations? Probably they do in some cases – there are multiple visual channels, some of which encode real data, but the rest have to be populated some other way. Probably the best is to make the data uniform in any channel which is not meant to be interpreted, though that’s not always possible – for instance, suppose you decide not to use position – you can’t put all of the graphics objects in your visualization on the same spot!

Summarizing – Definitely this is a danger in visualization. One has to take care that the information being discarded to fit the data into a 2-dimensional space is non-essential.

Confirmation bias – As long as the mode of visualization is chosen before seeing how it looks, one should be free of this bias, however, sometimes you might try something out that doesn’t work for other reasons.

Probably the main reason for the prevalence of these kinds of intellectual dishonesty is that no institution can be created that will guarantee perfectly fair and impartial presentations of information. (There are imperfect ones, such as the academic review process.) Nevertheless Tufte is happy to give some stern warnings. 🙂 Another reason would be that it is often done unconsciously – even with the best of intentions, producing a complex communication which is free of distortion is a strenuous undertaking.

Chris North, on measuring visualization insight.

A personal opinion – I am generally inclined to never believe self-reports on subjective criteria. Thus, a study that asks users if they liked a certain visualization, or if they would use it, is not very convincing to me. Data on whether they actually used one tool over another is far better. Better still is some metric on how successful users are at certain tasks when using a certain tool.

I once read Jef Raskin’s book The Humane Interface, and it had some interesting things to say about quantitatively measuring how good certain interfaces were. Some metrics included Shannon information content (for instance, a popup which only asks you to click “OK” has an information content of 0), time taken to perform certain tasks, number of operations taken to perform certain tasks, number of errors made in performing certain tasks, etc.
http://www.amazon.com/Humane-Interface-Directions-Designing-Interactive/dp/0201379376/ref=sr_1_1?ie=UTF8&s=books&qid=1265705100&sr=8-1
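Raskin’s information-content argument can be sketched in a few lines. This is a minimal illustration (the function name is mine, and it assumes all options are equally likely, so the information of one selection among n options is log2(n) bits):

```python
import math

def choice_bits(n_options):
    """Shannon information of one selection among n equally likely options."""
    return math.log2(n_options)

print(choice_bits(1))  # "OK"-only dialog: nothing is communicated -> 0.0
print(choice_bits(2))  # Yes/No dialog -> 1.0 bit
print(choice_bits(8))  # menu of 8 equally likely commands -> 3.0 bits
```

The “OK”-only popup is the degenerate case: with only one possible response, the click carries zero bits, which is Raskin’s point that such dialogs demand an action without conveying any information.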

UI is not the same as data visualization, though if the visualization is interactive, then it’s pretty close.

I think Chris North’s point is that, sure, you can measure how effective people are at doing some manual tasks, which is all well and good, but how does the visualization help make someone a better scientist, engineer, or detective? This is a much more ambitious question. While there probably are lots of gainful things to do, I wouldn’t expect a full “reduction to practice” for something of this magnitude. The main pitch is to let users explore the data, and record their insights as they go, either by writing them down, or saying them out loud.

Coding – I think this is the heart of it, because this is how you tell how good the insights were. This is basically the entire problem, swept under one giant rug. Ok, we need to measure something complex and subjective – so we’ll have someone assign numbers to it according to their subjective notions of what constitutes depth. North writes,

“Coding converts qualitative data to quantitative and is inherently more subjective, but supports the qualitativeness of insight. Significant objectivity can be maintained through rigorous coding practices.”

Yes, you can have rigorous, but still subjective practices, which leave the door open to all of the drawbacks of purely unquantified measures. Numbers are a good way to represent things, but they have to mean what they say.
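One standard way such “rigorous coding practices” are checked is an inter-rater agreement statistic: have two coders label the same insights independently and measure agreement beyond chance. A minimal sketch of Cohen’s kappa, with invented labels and data purely for illustration:

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    # Observed agreement: fraction of items both coders labeled the same.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement from each coder's marginal label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n**2
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical coders rating the "depth" of ten insights.
a = ["deep", "deep", "shallow", "deep", "shallow",
     "deep", "shallow", "shallow", "deep", "deep"]
b = ["deep", "deep", "shallow", "shallow", "shallow",
     "deep", "shallow", "deep", "deep", "deep"]
print(round(cohens_kappa(a, b), 2))  # -> 0.58
```

A kappa near 1 means the coders agree far beyond chance; a kappa near 0 means the numbers they assign are no better than guessing – which is exactly the worry about subjective coding above.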

Clustering – Who does the clustering, and how?

Errors – Who is to say what is an erroneous insight?

Domain experts – I think this is a good idea that compensates for a lot of the weaknesses above. In any field, there are inherent controversies and ambiguities, and resolving them is no easier than deciding whether someone has gained an insight from exploring the data. The solution is to nominate some people as experts whose opinions can arbitrate, wherever there is a consensus among them. If it’s good enough for the whole of academia, and leadership in general then it’s probably good enough for evaluating visualizations.

In any case, I think it’s worth pursuing, because visualization is definitely NOT the same as UI design. Just the same, I’m wary of drawing comfort from made-up numbers without being extremely cautious. This is in fact one of the criticisms of current methods which use yes-no questions which can lose a lot of nuance.

By: lyalex https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-109 Tue, 09 Feb 2010 08:34:02 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-109 In reply to ChamanSingh.

I still think Minard’s graph is quite good. To my understanding, he tries to show the historical map of the march with his own attitude: the losses of a war.

I’m trying to answer some of Chaman’s questions:
(1) How is a multivariate analysis of 6 dimensions useful in knowing what we want to know? How are latitude and longitude useful in the information? (At least it’s not clear from his writing.)

It’s a map. The primary goal of Minard might be to present the historical information about this “death march”, so latitude and longitude are used to mark the French Army’s path.

(2) Where is the direction of army movement in the graph? Wouldn’t it be simpler to draw two pictures separately on the same graph?

The direction is indeed not explicit. However, the tan and dark routes are the path of the same entity (the French army); drawing them together clearly shows the losses in battle as well as the losses to the winter.

(3) I got completely confused about the direction of the dates. Couldn’t make out who is attacking whom? The dates decrease from left to right.

The dates only correspond to the dark route. The French were defeated at the gates of Moscow and retreated ever after. The tan route shows the French Army’s march to beat the Russians; after they were defeated, they suffered from the Russians’ attacks and the cold winter, so dates are only necessary on the dark route.

By: lyalex https://pages.graphics.cs.wisc.edu/765-10/archives/422-reading-4-evaluation#comment-108 Tue, 09 Feb 2010 08:21:22 +0000 http://graphics.cs.wisc.edu/Courses/Visualization10/?p=422#comment-108 As I’m not yet (and might never be) an expert in VIS, I consider Tufte’s paper more impressive, as it uses a single example, Minard’s map, to explain the principles of visualization. An example of a good map is really helpful for understanding the author’s points. The principles listed are quite complete; at least they cover what is important in my domain science (natural science) fields. Comparison is exactly why we always use a “control group”, and multivariate analysis is why we use 3-D graphs to show our modelling. However, in the part on the relevance of the principles, the author didn’t clearly identify a “hierarchy” of principles. In my opinion, the context, documentation, and integration of evidence should be included in every graph in a series of natural science papers, but comparison and causality can be implicit.

Munzner’s paper created a nested model of four levels: (1) characterization of the task, (2) abstraction level, (3) visual encoding, and (4) algorithm for implementation. I really consider this a review paper, as it lists and cites so many previous papers and studies. Though it lacks detailed examples and was a little hard for me to understand, the nested model and its upstream–downstream invalidation method are really effective, and the idea of separating the levels will surely be helpful in the visual design process. The only “defects” I can find in the paper are: first, a detailed example that follows the levels of the model would be very good for helping the audience understand the paper, and would be a direct proof of the effectiveness of the model; second, the authors list a lot of previous papers which do not completely fit the model but still work, which leaves us with this question: is the nested model accurate and concise enough?

North’s paper is kind of trivial to me. My impression is only that the authors attacked the benchmark approach to evaluation, but the alternative evaluation method the authors proposed seems only a more qualitative version with far less feasibility. I learned and accepted the drawbacks of benchmark evaluation, but what can we do to avoid them? I consider the authors’ reply ambiguous.
