Topic:

The topic that we plan to explore in this project is how to compare multiple network visualizations. While techniques to address this issue have a wide variety of potential applications, there is not a substantial amount of existing literature directly addressing this problem. Thus far, we have a few potential ideas for implementation, but plan to elaborate on these and develop others over the course of the project.

Desired Outcomes:

The outcome of this project will be the implementation of multiple proposed solutions to the network comparison problem. Because of time limitations and the lack of existing solutions for guidance, we will likely simplify the problem set to a series of small networks. The details of the implementation, such as language and development package, have yet to be decided.

From this project, we hope to gain a better understanding of graph embeddings and techniques for improving readability as they apply to graph comparison. As we’ve seen, graph and network style visualizations are very common in practice. We hope to better understand how techniques that we’ve covered in class can help to create better means of comparison.

Reading List:

Graham and Kennedy. Exploring Multiple Trees through DAG Representations. IEEE InfoVis, 2007.

Frishman, Y. & Tal, A. Online Dynamic Graph Drawing. IEEE Transactions on Visualization and Computer Graphics, 2008.

Frishman, Y. & Tal, A. Dynamic Drawing of Clustered Graphs. Proc. IEEE Symposium on Information Visualization, 2004.

C. Collberg, S. Kobourov, J. Nagra, J. Pitts, and K. Wampler. A system for graph-based visualization of the evolution of software. In SoftVis ’03: Proceedings of the 2003 ACM symposium on Software visualization, 2003.

Ogata et al. A heuristic graph comparison algorithm and its application to detect functionally related enzyme clusters. Oxford University Press, 2000.

Hoebe and Bosma. Visualizing multiple network perspectives. ACM, 2004.

Major Layout Algorithms. http://www.yworks.com/products/yfiles/doc/developers-guide/major_layouters.html

Erten et al. Simultaneous Graph Drawing: Layout Algorithms and Visualization Schemes.

Time Table:

    • Week 1:
      • Develop Project Plan
      • Compile Initial Reading List
      • Begin readings
    • Week 2:
      • Complete readings
      • Brainstorm potential solutions
      • Select most promising solutions
      • Select implementation language/package
      • Create initial data set
    • Week 3:
      • Implement solutions
      • Test over additional data sets
    • Week 4:
      • Prepare report and presentation

Output & Data Sources:

We hope to implement several solutions to the network comparison problem. The data sources used for this visualization will likely be small, derived data sets. Hopefully there will be potential to apply our solutions to real-world data sets. However, since the purpose of this project is for exploration of general technique, integrating real-world data visualization is not currently part of the project plan.

Description:

My project is to visualize the hidden Markov chain structure of a data set from an educational study. The variables of the model are discrete, and the parameters are estimated via a Bayesian model.

The data was provided by a friend who works at Educational Testing Service, along with some older data sets. They also provide the statistical model, so I will use their models rather than analyzing the data myself.

Goal:

The result is intended for statisticians in the educational field, so I hope my work will be understandable to people in this domain. The visualization will describe the hidden Markov chain, its parameters, and the important transition states and probabilities.
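As a minimal sketch of what such a visualization could draw from, assume a made-up three-state chain. The state names, probabilities, and the `important_transitions` helper below are placeholders for illustration, not the model from the study. The transition matrix is stored per state, and the strongest transitions between different states are picked out so they could be drawn with heavier edges:

```python
# Hypothetical transition matrix: state names and values are illustrative only.
transition = {
    "novice":     {"novice": 0.60, "developing": 0.35, "proficient": 0.05},
    "developing": {"novice": 0.10, "developing": 0.60, "proficient": 0.30},
    "proficient": {"novice": 0.00, "developing": 0.10, "proficient": 0.90},
}


def check_rows(matrix, tol=1e-9):
    """Return True if every row of outgoing probabilities sums to 1."""
    return all(abs(sum(row.values()) - 1.0) <= tol for row in matrix.values())


def important_transitions(matrix, threshold=0.25):
    """List (source, target, probability) for transitions between
    different states whose probability is at least `threshold`,
    strongest first."""
    edges = [
        (src, dst, p)
        for src, row in matrix.items()
        for dst, p in row.items()
        if src != dst and p >= threshold
    ]
    return sorted(edges, key=lambda edge: edge[2], reverse=True)


if __name__ == "__main__":
    assert check_rows(transition)
    for src, dst, p in important_transitions(transition):
        print(f"{src} -> {dst}: {p:.2f}")
```

The threshold is an arbitrary choice here; in practice it would be tuned so that the diagram stays readable for the target audience.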

Initial reading:

1. JIAN ZHANG, MARKOV CHAIN AND HIDDEN MARKOV MODEL.

This reading is to understand the basic idea of fundamental HMM.

2. Gerard Rambally, A Hybrid Visualization Hidden Markov Model Approach to Identifying CG-Islands in DNA Sequences.

This reading is to learn how to explain HMMs to people in other domains, such as biology and genetics.

3. Valeria De Fonzo, Filippo Aluffi-Pentini and Valerio Parisi, Hidden Markov Models in Bioinformatics, Current Bioinformatics, 2007, 2, 49-61.

This reading provides a good example of visualizing an HMM with an emphasis on the probabilities in the model.

4. Building up some coding skills, which I think is important for me.

Schedule:

Week 1 (04/16): Try to implement the reference related to the data and model. Read the papers to understand HMM.

Week 2 (04/23): Provide a visualization of the data that has at least the basic features of an HMM. If possible, I will get some feedback on how to improve it.

Week 3 (04/30): I hope to produce several more visualizations derived from the initial one and compare them.

Week 4 (05/06): Finish the implementation and make a function/package/toolbox that could be used for later analysis. Write a report.

Description:

This project will investigate the methods of producing stereo images in different areas of media, with a focus on scientific data. It will also investigate the relative value of stereo when compared to other depth cues such as shading and occlusion. The project will culminate in a simple application that shows how stereo compares to other depth cues when both are present.

As a possible addition, I would like to have a number of different people try my software and comment on the relative usefulness of stereo.

Desired Outcomes:

I wish to produce several things, including: a summary of what is required to create a “good” stereo pair, a program that compares different depth cues to stereo, and a list of depth cues, each ranked against the stereo depth cue (i.e., better or worse).

Initial Readings:

I have three initial readings that will hopefully produce leads on other literature:

  1. Michael F. Deering. Making Virtual Reality more Real, Experience with the Virtual Portal. Proc. Graphics Interface ’93.
  2. Michael F. Deering. The Limits of Human Vision. In 2nd International Immersive Projection Technology Workshop, 1998.
  3. Geoffrey S. Hubona. The relative contributions of stereo, lighting, and background scenes in promoting 3D depth visualization. ACM Transactions on Computer-Human Interaction, 1999.

Timetable:

By the following dates I would like the following to have occurred:

Friday, April 23 – Read initial papers and any leads, have summaries, and have initial code in place to produce stereo images on CSL machines.

Friday, April 30 – Have the stereo/depth cue comparison program done and have initial measurements from different people.

Example Visualizations:

I will produce hard copies of the images produced by my software. In most cases this will be a simple image contrasting two depth cues.

Project Plan (ET)

April 16, 2010

in Final Project

I will be trying to create a tool to visualize Electroencephalography (EEG) data. The specific experiment measures brain response to repeated auditory stimulation interspersed with periodic rest states. Eventually, the brain response gets into phase with the auditory stimulation (a series of clicks). However, what is difficult to show is the time it takes the brain to get into phase with the sounds and whether that varies over time.

I would like to produce a visualization which demonstrates the differences in the time it takes to get into phase, which itself will vary over time. I also want to produce a visualization which can compare between multiple patients.

The initial readings are:

Lin, Jessica et al. Visualizing and Discovering Non-Trivial Patterns in Large Time Series Databases. Information Visualization, 2005, 4(2), 61-82.

Marc Weber, Marc Alexa, and Wolfgang Muller. Visualizing Time-Series on Spirals. Proceedings of the IEEE Symposium on Information Visualization, 2001, 7.

Cutting J E, 2002, “Representing motion in a static image: constraints and parallels in art, science, and popular culture” Perception 31(10) 1165 – 1193.

I am currently looking for more readings on representing large time series.

By April 23, I hope to have a basic structure for reading in the information and displaying the waves that represent the data. I will also have scraped the data for the relevant information needed for my visualizations (e.g., peaks).

By April 30, I hope to have a few initial visualizations completed, as well as an analysis of how well they seem to work.

By May 6, I should have a useful visualization and will have developed a way to compare between multiple patients’ data.

The first visualization I will make is the basic (somewhat obvious) one of wave forms. I hope to then create one which will isolate the peaks of the waves and draw connections between them. It may be worth looking at the slope between these connections. Since the stimulation is periodic in nature, I am also considering testing the spiral method as described by Weber et al.
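To make the peak-and-slope idea concrete, here is a minimal sketch in plain Python. It assumes the signal is already a list of evenly spaced samples. The function names and the sample values are hypothetical, and real EEG data would most likely need smoothing or a minimum-height threshold before a simple local-maximum test like this is useful.

```python
def find_peaks(samples):
    """Return the indices of strict local maxima (a sample larger than
    both of its immediate neighbours)."""
    return [
        i
        for i in range(1, len(samples) - 1)
        if samples[i - 1] < samples[i] > samples[i + 1]
    ]


def peak_slopes(samples, sample_rate_hz):
    """Return (peak_indices, slopes), where each slope is the change in
    amplitude per second along the line joining two consecutive peaks."""
    peaks = find_peaks(samples)
    slopes = []
    for a, b in zip(peaks, peaks[1:]):
        dt = (b - a) / sample_rate_hz  # seconds between the two peaks
        slopes.append((samples[b] - samples[a]) / dt)
    return peaks, slopes


if __name__ == "__main__":
    # Made-up samples, one per second, purely for illustration.
    wave = [0, 2, 1, 0, 3, 1, 0, 5, 2]
    peaks, slopes = peak_slopes(wave, sample_rate_hz=1.0)
    print("peak indices:", peaks)
    print("slopes between consecutive peaks:", slopes)
```

Plotting the slopes against time would give one candidate view of how quickly the response builds up across a stimulation period.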

My data will be provided by the University of Pennsylvania Brain Behavior Laboratory. It is EEG data obtained while a subject undergoes 40 seconds of repeated half-second auditory stimulation (a series of clicks), interspersed with longer rest periods. There will be both raw data and averaged data, from multiple electrodes. This data will also be from multiple patients.

Project: VAST 2010 Mini Challenge 2

David He, Faisal Khan, Ye Liu

Introduction

For this challenge, we have been given hospital admittance and death records for cities involved in a major epidemic outbreak. There are two main tasks. The first is to characterize the spread of the disease by considering attributes such as its symptoms, mortality rates, and the temporal patterns of its onset, peak, and recovery. The second is to compare the outbreaks across cities.

Initial Readings

We will try to add more readings here before Friday. For now, we know that we will be going through the book “Illuminating the Path”, which is freely available as an e-book.

Milestones or Weekly Goals

Week 1: This week our main goal is to become familiar with the dataset. We plan to write small tools to help us understand things like the mortality rate over time and the leading syndromes for mortality. This preprocessing phase requires some text processing because of the abbreviations and aliases used for the names of the syndromes. We will also make simple plots to analyze the above-mentioned variables. To this end, we have imported all the data into a SQLite database and will soon start running queries and doing some basic charting to analyze the data. We are using Python with gnuplot for this purpose.
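As a rough sketch of the kind of query we have in mind, the snippet below uses Python's built-in `sqlite3` module on an in-memory database. The table names, column names, and rows are assumptions made up for illustration; the real challenge data has its own layout. It assumes at most one death record per patient.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE admissions (patient_id INTEGER, admit_date TEXT, syndrome TEXT)"
)
conn.execute("CREATE TABLE deaths (patient_id INTEGER, death_date TEXT)")
conn.executemany(
    "INSERT INTO admissions VALUES (?, ?, ?)",
    [
        (1, "2009-05-01", "fever"),
        (2, "2009-05-01", "cough"),
        (3, "2009-05-02", "fever"),
        (4, "2009-05-02", "fever"),
    ],
)
conn.executemany("INSERT INTO deaths VALUES (?, ?)", [(2, "2009-05-04")])


def daily_mortality(connection):
    """For each admission date, return (date, admitted, died, rate), where
    `died` counts patients admitted that day who have a death record."""
    query = """
        SELECT a.admit_date,
               COUNT(*)            AS admitted,
               COUNT(d.patient_id) AS died
        FROM admissions AS a
        LEFT JOIN deaths AS d ON d.patient_id = a.patient_id
        GROUP BY a.admit_date
        ORDER BY a.admit_date
    """
    return [
        (day, admitted, died, died / admitted)
        for day, admitted, died in connection.execute(query)
    ]


if __name__ == "__main__":
    for row in daily_mortality(conn):
        print(row)
```

The resulting rows could be written to a text file and handed to gnuplot for the basic charts.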

Additionally, we will do a substantial amount of reading. We are mostly targeting parts of the book “Illuminating the Path” to get some understanding of the ideas behind visual analytics. We will also try to find papers on epidemiology that are relevant to our project.

Week 2: During this week we will focus on coming up with multiple visual designs that answer the first question of the challenge, i.e., characterizing the spread of the disease. We hope that we will not have to do a lot of implementation to evaluate these designs; mock-ups or static images might be enough to help us narrow down the choices. We plan to begin by getting a concrete understanding of how domain experts study the spread of diseases. We will look into existing visualization tools and some of the literature on understanding the spread of disease. If we get access to a domain expert, we will seek his or her help on this topic as well.

We expect to finalize the development environment for our project by this week.

Week 3: By this week we hope to finalize a design for visualizing the spread of the disease. Among the criteria we will consider in finalizing the design, an important one will be how suitable and extensible the design is for answering the second part of the challenge, i.e., comparing disease spread across different cities. We will spend most of our time on implementation and the remainder on readings about the evaluation of visualizations.

Week 4: Finish implementing the visualization. Begin evaluating the visualization.  Write the report and present.

Individual Responsibilities

All three of us will be involved in designing the visualization and determining whether the design meets expectations. We plan to meet several times over the coming days. We expect to use these joint sessions to help each other decide on readings and implementation, brainstorm ideas, and plan next actions. David and Faisal will do most of the implementation and will try to divide it equally between them. Ye (Alex) will be able to read their code and use any of the tools (custom-built or off the shelf) for experimentation and analysis. He will do more of the reading to contribute equally.

Our Expectations

Applying the methods learned in the class to solve a practical problem.

Being able to answer the questions posed in the challenge.

Understanding of the process involved in the design of visualization and the evaluation of it.

Knowing how to handle large data sets in visualizations.

General understanding of the opportunities and challenges involved in the field of visual analytics.

Hopefully interacting with a domain expert (epidemiologist) to gain insight into how epidemics are studied.

Non-linear zooming and Clique highlighting in Network Visualization

Team: Chaman, Nakho, Jeeyoung

Overview

Our group project aims to build an improved network diagram visualization that includes new, effective measures for removing clutter and highlighting complete subgraphs (“cliques”). After discussing the many challenges in network graphs, we decided that the most basic and concrete problem to work on is how to make pattern finding in large datasets easier. We also wanted to handle cliques, which are widely regarded as indicating qualitatively distinct communication patterns.
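Before cliques can be highlighted they have to be found. One standard way to enumerate maximal cliques is the basic Bron–Kerbosch recursion, sketched below without pivoting. This is only an illustration under the assumption that the graph fits in memory as a dictionary of neighbour sets; it is not necessarily the algorithm we will end up using, and the tiny example graph is made up.

```python
def maximal_cliques(adjacency):
    """Enumerate all maximal cliques of an undirected graph.

    `adjacency` maps each node to the set of its neighbours.
    Returns a list of cliques, each given as a sorted list of nodes.
    """
    cliques = []

    def expand(current, candidates, excluded):
        # `current` is a clique; `candidates` can still extend it;
        # `excluded` holds nodes already handled in earlier branches.
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for node in list(candidates):
            expand(
                current | {node},
                candidates & adjacency[node],
                excluded & adjacency[node],
            )
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques


if __name__ == "__main__":
    # Hypothetical graph: a triangle a-b-c plus a pendant edge c-d.
    graph = {
        "a": {"b", "c"},
        "b": {"a", "c"},
        "c": {"a", "b", "d"},
        "d": {"c"},
    }
    print(sorted(maximal_cliques(graph)))
```

The basic recursion can be slow on large, dense graphs, so the datasets described here may call for a pivoting variant or a size cutoff.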

To reduce clutter we will explore the possibilities of non-linear zooming, while for visualizing cliques we will focus on how to assign effective visual cues to the subgraphs so that they stand out in the large network graph.
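One well-known form of non-linear zooming is the graphical fisheye, where a node at normalized distance x from the focus is moved to distance (d + 1)x / (dx + 1) for a distortion factor d. The sketch below applies that mapping to 2D node positions. It is an illustration of the general idea only; the function name, the radius cutoff, and the default distortion value are our own assumptions.

```python
import math


def fisheye(point, focus, radius, distortion=3.0):
    """Move `point` away from `focus` using a fisheye mapping.

    Points at the focus, or at least `radius` away from it, are returned
    unchanged. Inside the lens, the normalized distance x in (0, 1) is
    mapped to (distortion + 1) * x / (distortion * x + 1), which spreads
    out nodes near the focus and compresses those near the lens boundary.
    """
    dx = point[0] - focus[0]
    dy = point[1] - focus[1]
    dist = math.hypot(dx, dy)
    if dist == 0 or dist >= radius:
        return point
    x = dist / radius
    mapped = (distortion + 1) * x / (distortion * x + 1)
    scale = mapped * radius / dist
    return (focus[0] + dx * scale, focus[1] + dy * scale)


if __name__ == "__main__":
    # A node one unit from the focus inside a lens of radius four.
    print(fisheye((1.0, 0.0), (0.0, 0.0), radius=4.0))
```

A distortion of 0 leaves the layout unchanged, so the factor could be tied to a slider to let users control how strongly the focus region is magnified.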

We will be using two distinct network datasets, one from the social sciences and the other from the natural sciences. The first dataset is the coeditor network of Wikipedia. Wikipedia is a popular, open, Internet-based collaborative platform for compiling all fields of knowledge in an encyclopedia format, currently holding more than 3 million articles. Anyone can write, and anyone can edit what anybody else has written. By examining on a large scale which two or more users have written and edited a common article together, we can see how specific knowledge production networks emerge in an open structure.
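A coeditor network of this kind can be derived from a list of (user, article) edit records. The sketch below is a minimal illustration with made-up user and article names; the real data format may differ. Two users are linked if they edited the same article, and the edge weight counts the articles they share.

```python
from collections import defaultdict
from itertools import combinations


def coeditor_edges(edits):
    """Build weighted coeditor edges from (user, article) pairs.

    Returns a dict mapping (user_a, user_b), with user_a < user_b, to
    the number of articles both users have edited. Repeated edits by
    the same user to the same article are counted once.
    """
    editors_by_article = defaultdict(set)
    for user, article in edits:
        editors_by_article[article].add(user)

    weights = defaultdict(int)
    for editors in editors_by_article.values():
        for pair in combinations(sorted(editors), 2):
            weights[pair] += 1
    return dict(weights)


if __name__ == "__main__":
    # Hypothetical edit log, for illustration only.
    log = [
        ("ann", "Yeast"),
        ("bob", "Yeast"),
        ("ann", "Yeast"),
        ("ann", "Protein"),
        ("bob", "Protein"),
        ("cy", "Protein"),
    ]
    print(coeditor_edges(log))
```

Articles with very many editors produce a quadratic number of pairs, so a cap on editors per article may be needed at Wikipedia scale.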

The second dataset is the protein interaction network of yeast. Protein-protein interactions reveal the functional control of target proteins as well as molecular complexes. These interaction data can take the form of binary interactions or groups of interactions, and they can come from several different experimental methods. First, we will visualize the interaction information from the several methodologies in an efficient way. We expect cliques with overlapping links to indicate high-confidence interactions. Second, we will visually integrate the protein-protein interactions with known pathways or genetic interactions. Then, we can investigate the relationships among protein-protein interactions, pathways, and genetic interactions.

Our team is composed of members with different academic backgrounds, which will be a significant advantage if the subtasks are properly allocated. Chaman will focus on algorithm implementation, Nakho will provide the Wikipedia data and work on design aspects, and Jeeyoung will provide the protein data and process the raw material into network data. Other subtasks, including the literature review, will be done collaboratively.

Project schedule (tentative)

Week 1 (~Apr/16) : Planning and compiling the initial literature
Week 2 (~Apr/23) : Literature review, preprocessing of data, analyzing algorithms of existing software (e.g. H3viewer, HypViewer, LGL code), Progress report #1
Week 3 (~Apr/30): initial implementation of improved visualization tools, Progress report #2
Week 4 (~May/6 11am): finalizing tools, presentation, final report

Literature to review (tentative)

– Frishman, Y. & Tal, A. Online Dynamic Graph Drawing. IEEE Transactions on Visualization and Computer Graphics, 2008, 14, 727-740
– Frishman’s thesis
– Frishman, Y. & Tal, A. Multi-Level Graph Layout on the GPU. IEEE Transactions on Visualization and Computer Graphics, 2007, 13, 1310-1319
– Frishman, Y. & Tal, A. Dynamic Drawing of Clustered Graphs. Proc. IEEE Symposium on Information Visualization INFOVIS 2004, 2004, 191-198
– deMoll, S. B. & McFarland, D. A. The Art and Science of Dynamic Network Visualization. Journal of Social Structure, 2006, 7
– C. Collberg, S. Kobourov, J. Nagra, J. Pitts, and K. Wampler. A system for graph-based visualization of the evolution of software. In SoftVis ’03: Proceedings of the 2003 ACM symposium on Software visualization, pages 77–ff, 2003.
– M. Freire and P. Rodriguez. Preserving the mental map in interactive graph interfaces. In AVI ’06: Proceedings of the working conference on Advanced visual interfaces, pages 270–273, New York, NY, USA, 2006. ACM.
– Michael Kaufmann & Dorothea Wagner. Graph Drawing: Methods and Models. 2005. (selected chapters)
– http://www.leader-values.com/Content/detail.asp?ContentDetailID=975 social groups visualization which focuses entirely on design techniques
– http://www.ibluemojo.com/school/clique_algorithm.html algorithms to calculate cliques efficiently
– http://portal.acm.org/citation.cfm?id=860151 algorithms to calculate cliques efficiently
– http://www.cytoscape.org/ Visualization tool used in biological data
– “A survey of visualization tools for biological network analysis” by Pavlopoulos et al. (2008)
– “Visualization of omics data for systems biology” by Gehlenborg et al. (2010)

If you have a great narrative, all the visualization tool you really need is… some Lego bricks.

http://www.gapminder.org/videos/population-growth-explained-with-lego/

Project Plans

April 13, 2010

in Final Project

By Friday, April 16th, you need to have posted a project plan. The original idea was that you should post one earlier and we would iterate over it to refine it. I hope you will still do that – when you have something together, post it and send the instructor and TA email and we’ll get you some feedback. The goal is to find things that are good topics, but also where you can do something meaningful in the short time.

Each proposal should have:

  • A description of the topic
  • A description of the desired outcomes: what do you expect to produce from the project (a program, an experiment, a document, etc)
  • An initial reading list: what are the first few things you plan to read on the subject. If you need help in identifying relevant readings, ask for help.
  • A timetable: what do you expect to have achieved by the milestones
  • A description of the example visualizations you will make (remember, even if you write a survey or build software, you must actually do visualization on some data). Please tell us about your data sources.

In general: ask for help. Talk to us as much as possible. We really want to help guide you towards doing something cool and interesting.

Animation Reading Posted

April 12, 2010

in News

Again, just linked to keep the news page clean.

(assignment due Thursday 4/15)

Now for something completely different: we’re going to talk about the (artistic) principles of animation. This might seem a little off-topic. However, knowing these principles is really useful in using motion for visualization. Plus, it’s more fun than some of the other topics. And it’s tax day, so I need something fun to cheer me up.

You need to read one of the “principles” readings, and the “animated transitions” reading (at the bottom). Then comment on how you think this might relate to other things we learned in class. (the 2nd one is what I recommend, but you might pick 2 and 3)

The classic reference for the Principles of Animation is “The Illusion of Life” – a book about the history of Disney animation. It’s a coffee table art book – not necessarily something meant for either animators or computer scientists to learn from. But it is fabulous, and full of great examples from classic Disney films:

  • Thomas and Johnston. Disney Animation: The Illusion of Life. Several editions (Abbeville Press, 1981 is the “original” I think). Chapter 3: The Principles of Animation. (26MB download)

Because so many artists wanted this book, it has been reprinted many times (I own 3 different reprints). Curiously, one of the editions is more focused on teaching artists. In this version, Chapter 1 is the principles (very similar to Ch3 in the original). The preface is a good introduction to animation pre-“Principles” (which is good for understanding them). And Chapter 2 is a great summary of how they made the movies (irrelevant for class).

John Lasseter was a Disney animator who went to work with a small company of graphics hackers. The company grew and grew and grew and now everyone knows Pixar. His SIGGRAPH 1987 paper was a seminal work where he introduced the graphics world to the principles of animation. The basic content is the same as the Thomas and Johnston chapter, but it’s more condensed, and the examples are from Pixar films.

  • John Lasseter. Principles of traditional animation applied to 3D computer animation. SIGGRAPH 1987. (acm site with PDF). Note, there are many summaries of this paper on the web. Here’s one by a well-known animator. But do read the original. (well, you’re even better off reading a Disney thing first, then reading this for historical context).

Now, you might wonder “what does this have to do with visualization.” One answer (and this is only one of several) can be seen in:

  • Jeffrey Heer and George Robertson. Animated Transitions in Statistical Data Graphics. InfoVis 2007. (project page – I strongly recommend watching the movie as it is well done. You might not even need to read the paper.)

In your comment, say which things you’ve read, and your thoughts on the roles this might have in the kinds of things we discuss in class.