(readings due Tuesday, Feb 23rd)
In this reading, we’ll start our exploration of human perception with an eye towards visualization. Perception is a big topic – there are several courses on it at the university, so we can (at best) hope to scratch the surface.
The primary (required) reading is:
- Chapters 1 and 2 in Visual Thinking For Design. Colin Ware’s take on it is interesting.
Another great survey is a web-based one by Chris Healey. This one is nice because it includes some applets and demos that show off some of the surprising pre-attentive processing facts. The survey covers more than the first two chapters of Ware (it covers color, for example), but seeing some of these topics before we discuss them in class is a good thing (color is a big topic!).
- Perception in Visualization, by Christopher Healey. Web page.
A recommended (but optional) reading is the Chapter from the 559 textbook. If you don’t have a copy, enough of us do that you can borrow one. This is much more of a “basic facts about perception” thing, and it covers many of the more advanced perception topics (like depth) that we won’t get to in Ware’s book for a few weeks.
- Visual Perception by William Thompson. Chapter 22 of Fundamentals of Computer Graphics (by Shirley, et al). (It's Chapter 21 of the 2nd edition.)
As usual, please post at least one comment on what you’ve learned. One question you might want to address: what was the thing that most surprised you about how we see?
{ 12 comments }
Visual Thinking:
Chapter 1: Visual Queries:
***************************
This chapter has many interesting facts and observations:
1. Visual thinking is entirely an act of attending, which I find both surprising and doubtful. If that is true, I am curious why the term "pre-attentive" ever arose in cognitive science.
2. The eye, acting as a biological lens, creates an inverted image, yet we perceive the world upright. Why did we evolve that way?
3. Half of the brain's processing power is directed at less than 5% of the visual world. I am not sure what happens to the other 95%: do we not see it, or do we simply not attend to it?
4. The visual system is capable of holding only about three objects at a time.
5. Visual thinking is a "pattern finding" process, and the system that performs this task is multicomponent: each component extracts and solves a simple problem. There is no "central processing unit" in the brain; everything is distributed.
6. For the visual system, redoing cognitive operations is far easier than remembering.
7. Good visual thinkers have good error-checking procedures.
Chapter 2: What we can easily see
**********************************
1. The author says that "pre-attentive" is a misnomer and suggests substituting "tunable". I agree that "pre-attentive" is not a good choice of word, but in my opinion "tunable" is also ambiguous in this context. We need better vocabulary to describe the phenomenon of instantly finding certain patterns (in less than 200 ms).
2. Detecting motion is a complex mechanism, and the periphery of the visual field is far more sensitive to motion than the central fovea.
3. Overuse of motion can create the worst form of visual pollution.
4. To enable effective visual search, a design should have both large-scale and small-scale structure.
***********************************************************************
Paper: Perception in Visualization
Author: Christopher Healey
The systematic experiments and observations on pre- and post-attentive processing make this paper interesting to read.
Five theories of preattentive processing:
(1) Feature Integration Theory
(2) Texton Theory
(3) Similarity Theory
(4) Guided Search Theory
(5) Boolean Map Theory
One interesting observation this paper makes is that nonphotorealistic images can be more effective, appropriate, and far more expressive than equivalent photographs. I think I agree with this observation: excessive use of color, depth, and texture creates a complex, ambiguous scene that requires more attention. Most non-photorealistic images use first- or second-order statistics (corners, lines, contours, etc.) to represent an object, and per texton theory these patterns are easy to search and analyze.
Chapter 1 can probably be summed up by saying that perception is a weighting of our pattern-recognizing hardware by the task at hand. The task weights different pattern receptors more heavily based on what it requires.
The experiment described in the first chapter seems really non-intuitive and my wife didn’t believe it when I told her. But it does make sense that the world acts like an external memory and we really can’t hold that much of it in our brains. I wonder if the secret to an efficient life is to make the world conform to your version of the memory so that you don’t spend as much time querying the environment. It seems really efficient of our brains to just use our eyes as an interface to reality.
I've always been amazed at the lack of a black spot where the blind spot should be. The book says that the brain doesn't even know it's there, but I don't think that's true. It's more like the brain knows exactly where it is and compensates for it. It's reasonable to say that when both eyes are open the brain can use information from one to fill in the blind spot of the other, but even when one eye is closed, there is no black spot. True, you can't see certain things, but even when I couldn't see the letter B in the book, I could still see that the page was white.
The fovea is also an interesting concept. It’s very true that when I look at my desk, I can’t describe any one thing accurately, but I can produce a list of everything that I see in my periphery. I think there must be some other process that deals with the periphery that the book hasn’t expounded on because it seems like you can still pick up a lot of information from it.
I think chapter 2 shows that all the time I spent with Where's Waldo books was pretty useless. It was a very useful chapter though, and it laid out one of the cornerstones of visualization: making things pop out. The most important thing said in this chapter was on the topic of picking channels that make things pop out. The trick is to not encode everything in one channel, but to use three or four different channels to differentiate things. This was big for me because up until now I've been thinking in terms of "groups" of things where each group is encoded with one channel. Maybe it's better to differentiate members of each group with different channels.
So the real goal is to make a Where’s Waldo book that can be read in under 10 seconds.
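A minimal sketch of that pop-out idea, assuming matplotlib and made-up data (not from the readings): a target that differs in a single channel (color) is found almost instantly, while a target defined by a conjunction of color and shape forces a slower, Where's-Waldo-style serial search.

```python
# Sketch (assumes numpy + matplotlib): left panel, the target differs only in
# color and pops out pre-attentively; right panel, the target is defined by a
# conjunction of color and shape, which tends to require serial search.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 60
xy = rng.uniform(0, 1, size=(n, 2))   # random 2D positions for the items

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

# Left: single-feature target (one red circle among gray circles).
ax1.scatter(xy[1:, 0], xy[1:, 1], c="gray", marker="o")
ax1.scatter(xy[0, 0], xy[0, 1], c="red", marker="o")
ax1.set_title("single channel: pops out")

# Right: conjunction target (red circle among red squares and gray circles).
half = n // 2
ax2.scatter(xy[1:half, 0], xy[1:half, 1], c="red", marker="s")
ax2.scatter(xy[half:, 0], xy[half:, 1], c="gray", marker="o")
ax2.scatter(xy[0, 0], xy[0, 1], c="red", marker="o")
ax2.set_title("conjunction: serial search")

for ax in (ax1, ax2):
    ax.set_xticks([])
    ax.set_yticks([])
plt.show()
```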
Chapter 1 Visual Queries:
This chapter introduces how people perceive through their eyes. We make visual queries every time we search for visual information, so anticipating those visual queries is a crucial point for designers.
The fovea was new knowledge to me: the eyes make visual queries by pointing it at interesting locations. Visual queries are a series of searches for particular patterns. The act of perception is driven by two kinds of processes, bottom-up and top-down. When making visual queries, the brain solves problems through a set of nested loops.
The analogies in this chapter, viewing the eyes as a digital lens and the brain as a computer, help in understanding those organs' roles in visualization.
Chapter 2: What can we easily see?
Visual search is a common activity in all seeing, since we keep reassessing the visual world every time we look. The patterns that show a pop-out effect involve many pre-attentive features: colors, shapes, sizes, and so on can all be tunable characteristics. Motion can enhance visibility; too much motion is one of the worst forms of visual pollution, but carefully used motion is a good technique.
Perception in Visualization
Five theories are introduced in this paper. In the discussion of Guided Search Theory, the concepts of 'top-down' and 'bottom-up' are mentioned as part of the guided search framework. In Boolean Map Theory, the process of visual search is divided into two parts, selection and access: the former involves choosing the objects to be viewed, and the latter determines what information the viewer can extract from them.
An interesting point in this paper is Wolfe's experiment, which concluded that sustained attention to the tested objects did not make visual search more efficient. This matters when scientists present their data: it implies that, when searching for a specific data value, the display may not be as useful as expected. It also changed my original thinking about how to make data visible.
The Ware readings were very insightful in their delivery of the analytical science underlying current perceptual theory. The concept of a "visual language" is very interesting: the juxtaposition of linguistic hierarchies and visual hierarchies demonstrates that labeling one visual language as better than another is not necessarily valid, given the variety of cultural influences underlying arbitrary visual constructs.
Additionally, Gibson’s theory on affordances, which states that perception is based on directly understanding only possibilities for action, provides an interesting twist on the interpretation of the science of perception. Previously, I’d always taken for granted that perception involves only those objects that we “see,” essentially using the physical definition of vision. I’d never taken the time to consider what in a sense is the purpose underlying perception, let alone consider this as a definition of the concept.
Finally, the discussion of the significance of texture brings yet another dimension into the realm of perception that is very easy to overlook. Given the impact of lighting and texture on the perception of shape and on object classification, it makes sense that texture in a visualization can provide another data dimension that exploits the perceptual system through visualization techniques.
Healey's paper really appears to expand on the themes of sensory (preattentive) versus arbitrary (postattentive) processing. What really stood out in his analysis, however, was how many different possible mappings of the two perceptual processing systems have been developed. Despite this large number of mappings, the essential themes of preattentive and postattentive processing seem to be common across all models. In general, these themes suggest that the most significant features in a visualization should be encoded in ways that play to the preattentive processing system, whereas more complicated features are more likely to be successfully perceived in the postattentive domain.
**Sorry if any of this appears new; the readings were done from Ware's Information Visualization book, roughly corresponding to the assigned chapters of Visual Thinking for Design.
Chapter 1 of Ware's book explains the basic visual perception mechanism of the human eye, which has only a narrow and limited perceptual focus at any given moment. Perception involves distributed processing in the brain, organized as three nested loops (pattern testing, pattern search, problem solving). Thinking in visualization terms, the eye's narrow focus and pattern-finding perception should be taken into account: a design should not present too much important information to process at once, and the most relevant parts should be recognizable within the narrow fovea.
Chapter 2 deals with how some visual elements "pop out" from a bunch of others. Explaining what we can see easily, he talks about visual channels that do not differ much versus those that stand out. Reading this chapter, I felt the need for organizing visual elements so that crucial patterns are readily discernible while noise does not accidentally pop out. Maybe limiting the number of encoding channels could help.
The Healey paper follows logically from chapter 2 of Ware by explaining why the things that pop out do so. He introduces five theories of pre- and post-attention in low-level perception: feature integration theory, texton theory, similarity theory, guided search theory, and Boolean Map Theory (although he doesn't say which is the most dominant one today). I thought that all the theories and his follow-up explanations of color, texture, motion, and nonphotorealism carry a common lesson: do not overload the viewer's visual information processing capacity, and focus on the important channels.
Chapter 1 of Ware provides an overview of the function, strengths and shortcomings of the human visual system. I found this chapter particularly interesting because it takes some amount of creativity and ingenuity to evaluate the shortcomings in your own perception. This reminded me of a fairly well-known video I watched in an undergrad psychology course (http://www.lockjawslair.com/2005/04/22/gorillas-in-our-midst/) that made me realize how little I must actually see. But there are advantages to human vision. I think the comparison of the eyes with digital cameras made the differences between the active human visual system and the passive data collection of a camera more apparent.
My favorite part of chapter 2 was the short section on motion. Animation is something we haven’t covered yet in class, but it seems important. This brief analytic overview was a good introduction to the topic. The rest of the chapter was more or less a repackaging of the information in chapter 27 of the graphics textbook.
I was surprised at just how many features are being calculated in the visual cortices – the million or so fibers of the optic nerve leading to 5 billion or so neurons in V1 & V2, being used to extract millions of features – computer vision really pales next to that.
The interesting part for me was just seeing the examination of what is and is not detected without conscious effort. In particular, the fact that simple conjunctions of features don't jump out the way single features can tells us a little bit about what the primary visual centers are doing. But another question is: suppose a feature jumps out, where does it jump out to? I think he did mention that there is no single control center in the brain, which makes sense, so I guess the answer is that anything that jumps out does so by jumping ahead in the where/what pathways via a shortcut.
The attention devoted to eye movements was something I hadn’t expected, but it makes sense in that the center of focus has an enormous impact on what you’re seeing. It’s almost like the world is your hard drive, your eyes are the read head, and the various clues around are like the inodes and lookup tables of the file system – which has no overt organization in the real world, except what we make. I doubt if there are any useful insights from file systems with regard to better organizing visualizations, but the metaphor is kind of interesting to me at least.
Ever since the moonwalking bear video went viral [1], change awareness videos have become very popular on YouTube [2], including the experiment cited by Colin Ware in which a tourist asking for directions is swapped with another person while the direction-giver, the subject of the experiment, doesn't notice the swap [3].
This is a fascinating subject that proves its point very effectively. If we know what we are looking for, we can spot it, but we pay little attention to what we are not looking for. This becomes even more pronounced when we don't know what we are looking for. Differences in color, shape, boundary, and motion can act separately or together to differentiate between objects, as eloquently described by Healey. I've noticed my cats can't see a thing if it doesn't move; it is really the motion that catches their eyes.
Ware's numbers on the calculations that go on as our eyes go about seeing, and our brains go about perceiving, were fascinating. Once again, the computing metaphor for how our brain analyzes what we see (vision being the highest-bandwidth sense) comes through.
[1] http://www.youtube.com/watch?v=Ahg6qcgoay4
[2] http://www.youtube.com/results?search_query=change+blindness
[3] http://www.youtube.com/watch?v=vBPG_OBgTWg
The information provided on perception will be very helpful when deciding on encodings for our own visualizations. Chapter 1 by Colin Ware is nicely summarized in its conclusion. Design is hard in the sense that, while designing, one cannot look at the final artifact with fresh eyes. Following a design from inception makes it harder to judge whether it has achieved its goals, e.g. improved speed of visual queries. The fix is to know the analytic principles behind the design. The transition from an unskilled designer to a skilled one is possible, and the way there is to learn the precise principles that skilled designers follow intuitively.
The readings from chapter 2 and the web article follow the nice motivation of chapter 1. The main point of these two readings is understanding which features stand out against certain kinds of noise. Intelligent control of visual channels (color, shape, size, orientation, motion, etc.) is critical to effective visual design. Additionally, knowing the relationship between visual channels and whether the search is top-down or bottom-up is also important. If we are looking at a visualization with a task in mind, very different visual channels will stand out while others are suppressed. But this pop-out and suppression during a top-down search can also be influenced by stereotyped search strategies.
Visual Thinking:
Chapter 1: It lays out several basic physiological concepts and then brings up the bottom-up and top-down views of the act of perception. The implication for design is clear: visual queries should be processed both rapidly and correctly for every important cognitive task the display is intended to support. The concept of distributed cognition is also pointed out: there is no central unit for cognition in the brain.
Chapter 2: The content overlaps a little with Christopher Healey's survey. It also deals with low-level feature analysis first. Treisman's hypothesis is the main model the author adopts.
One important thing is the nested-loop model, which appears in both chapters. The two versions both include the outer problem-solving loop, the middle eye-movement loop, and the inner pattern-testing loop.
I like Perception in Visualization very much; it is a very good review paper that covers preattentive processing, then postattentive vision, then models of change blindness, then the elements of perception. To me, it seems that actual preattentive processing is a combination of the first four theoretical models.
Although I intuitively already knew this, one of the most surprising things about how we see is how small the area where we can see in detail really is. This fact forces us to constantly move our eyes to try to grasp the whole of what we are looking at. Perhaps even more surprising is the amount of stuff that we can still see outside of this region of attention. There are several of these pre-attentive (or tunable, as Ware refers to them) things that we are able to pick up on almost instantly.
The second reading neatly summarizes several theories and explanations of this phenomenon. I personally liked the Boolean Map Theory the most. It provides a good explanation of how users may conduct a particular visual query. This, coupled with the feature hierarchy, can give us tools to intelligently design visuals that make the most important information stand out first.
I was surprised that the different visual channels (form, size, color, orientation) are analyzed in area V1, which is low-level; that is why a channel can pop out when used in contrast to its surroundings. The fact that using multiple channels intensifies the contrast was also interesting.
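As a quick illustration of that last point, here is a minimal sketch (assuming matplotlib; the data is made up, not from the readings) comparing a target encoded with a single channel (color) against the same target redundantly encoded in two channels (color plus size), which strengthens the contrast against the surrounding items.

```python
# Sketch (assumes numpy + matplotlib): single-channel encoding on the left,
# redundant two-channel encoding (color + size) on the right.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
pts = rng.uniform(0, 1, size=(50, 2))   # random 2D positions for the items

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

# One channel: the target differs only in color.
ax1.scatter(pts[1:, 0], pts[1:, 1], c="gray", s=40)
ax1.scatter(pts[0, 0], pts[0, 1], c="red", s=40)
ax1.set_title("color only")

# Two channels: the target differs in color and size (redundant encoding).
ax2.scatter(pts[1:, 0], pts[1:, 1], c="gray", s=40)
ax2.scatter(pts[0, 0], pts[0, 1], c="red", s=160)
ax2.set_title("color + size")

for ax in (ax1, ax2):
    ax.set_xticks([])
    ax.set_yticks([])
plt.show()
```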