CS765 – Data Visualization – Spring 2017 — Course Web for CS765 Data Visualization, Spring 2017

Discussion Assignment Timing

by Mike Gleicher on February 15, 2017

I realize that I have been inconsistent in explaining the timings for the discussion assignments (these are the weekly discussions connected to the readings). Because of this, some things on Canvas got set up wrong (sorry if you got shut out of Assignment 4 – it is open again if you want to continue the discussion).

The weekly timings are:

Monday – before class – readings are due
Monday – any time – first posting due
Friday – any time – other required postings due (although, some weeks might have “special” things due earlier)
the following Friday – discussion closes.

So, if you’re staying on track… (i.e., are on time): you have made a first posting by the end of day Monday, made any other required postings by Friday, and made at least 2 comments in response to other postings by Friday. Then, there’s another week for you to continue the discussion (and in many cases, there are interesting discussions going on).

So for example, this week… Assignment 5… the initial posting was due on Monday (2/13), the additional postings (1-2 more checks of abstractions) are due on Friday 2/17, and you need to have contributed 2 additional things (replies to other people) by Friday as well. We’ll let you continue discussing things until 2/24 (you can turn in late things until 2/24 as well).

I have been reading a lot of the discussions, although I haven’t been taking part in the conversations or giving feedback. I’ve seen lots of really thoughtful postings, and lots of interesting discussions.

Piazza

by Mike Gleicher on February 14, 2017

Several students have asked about Piazza. (for those of you not familiar with it, it’s an online discussion tool specifically designed for University classes).

I was hesitant to add Piazza – since it means we have 3 different web things (the course web site, Canvas for assignments, and now Piazza). And it isn’t fully supported by the University.

But, Piazza is really good for free-form student to student discussions (e.g. asking questions, discussing things that come up, …). Canvas can do this, but it’s not so great. So I decided we’ll try it. If it fails, we’ll cut it out.

As we start getting to the Design Challenges, I think Piazza will be a great place for students to discuss tools, ask for help as you try to figure out how to make stuff, …

All “Official” discussions (e.g. assigned discussions that are graded) will still be on Canvas. Canvas is much better for keeping track of stuff, breaking the class into groups, turning in assignments, etc.

There should be a Piazza button in Canvas. According to the University Q&A, if you use this to access Piazza, it should automatically enroll you in the class.

If having 3 different online systems is confusing (4 if you count email), here’s a way to keep it straight:

The course web page is the way that the course staff broadcasts information to the class. All announcements will be here.
Canvas is the way that we will turn in assignments. Since some assignments are discussions, these discussions will be on Canvas (e.g., the weekly assignments, Seek and Finds). If it counts towards your grade, it will be on Canvas.
Piazza is for public discussions between students (possibly with course staff involved). If you want to talk about something that isn’t an assignment, you can start a discussion here.

Design Challenge 1: One Data Set, 4 Stories

by Mike Gleicher on February 13, 2017

Due Dates:

Kickoff Meeting: February 24th (Friday, Optional Class)
Data Set Selection: February 24th (All data sets must be approved)
Rough Drafts: March 7th (on Tuesday, and bring to class Wednesday)
Designs Turned In: March 14th (on Tuesday, and bring to class Wednesday)
Written Critiques: ~~March 29th~~ March 31st

Designs online! http://graphics.cs.wisc.edu/Courses/Visualization17/design-challenge-1/

Objectives: To make some visualizations with real data, and to explore how to tell different “stories” by choosing different encodings of the data. This is a chance to try out using visualization tools. If you’re an 838 student, you’ll get to try your hand at dealing with a larger data set.

Overview

In this assignment, you’ll pick one data set to make visualizations from. Then, you will make 4 visualizations – each telling a different “story” about the data. Then you will also make a 5th visualization that re-tells one of the stories from the first 4. The idea here is that you should explore the different kinds of visualizations you might make from this data and the different questions/tasks that you might want to show someone, and see how you can match the picture to the story.

We will provide a bunch of choices of data sets. We will check to make sure they are sufficiently challenging (there are good stories in them), yet not too hard in ways unrelated to the class (e.g., they need extensive cleaning or specialized science to interpret them). We encourage you to pick one of our data sets.

For this year, we will allow people to “bring their own data set” subject to a bunch of rules. The data set must be publicly available, must be on a topic of general awareness (i.e., not something that only researchers in a specialized field care about), and must be sufficiently challenging to work with. In order to use a data set not on our “approved” list, you must get our approval. We will have a “bring your own data day” (Feb 24) where you can bring your data set for public critique (and possible approval). If your dataset is approved, it will be added to the “list of approved data sets” so that anyone in class can use it. No new data sets will be approved after Feb 24.

You may use any tools that you like to create the visualizations – subject to the constraint that you are required to hand in PDFs, and to document your process. It is fine to use Excel or Tableau or R or JMP or some other “tool.” It is also fine to write your own programs that create visualizations in whatever programming language you like. There may be practical issues in getting pictures out of your own programs – at worst, you can use screen capture. We will have class sessions where we show off some tools you might want to use, as well as using Friday class time for “help sessions.”

For rough drafts (due March 7th) sketches are fine – the goal is to get feedback from others. We encourage you to do a lot of sketching to try out different designs (even if we can’t give feedback on all of them).

For the final ones, you should make real visualizations with the real data.

If you find that you aren’t able to exactly implement your design (e.g. you can’t figure out how to convince Excel to use the colors that you want), feel free to “cheat” a little (save the picture, open it in Photoshop, and paint over it), but part of the idea is to try to make pictures with real data (so don’t just sketch – unless you are doing precise measurements). If you’re really stumped on implementation, you can put a note in your caption (“the red dots were supposed to be blue”) – but try not to leave too much to the imagination of the viewer.

On Tuesday, March 7th, you will upload at least 2 sketches (either as PDF or image files) to Canvas, and tell us which data set you are using. You should bring printouts of 2 designs to class – we’ll take some time to critique each other’s work.

On Tuesday, March 14th, you will turn in your “final” visualizations (at least 5 – since for one of the stories you need to make 2 visualizations). For each visualization, there should be a good caption, explaining the data and enough of the story (although, if your graph is really great, the reader might figure out the story without reading the caption). Please do not put your name inside the PDF (so that we can send them out for anonymous critique). The PDFs should be 1 page each. It should be clear from the visualization and/or caption what data set it is. Turn in a 6th document that explains how you made the pictures, and what you were trying to show with each one. These will be turned in as an assignment on Canvas.

~~Please bring a printout of at least one of your visualizations to class on Wednesday, March 15th. We’ll do some in class critiquing. (although, it is too late for you to change your visualizations).~~

Shortly after March 15th, we will send each person a few visualizations to critique. We will grade you on your critiques. We will also provide the critiques back to the designer (note: student critiques will not determine the grades for the designs – the class staff will grade them).

How to do this?

We are explicitly not specifying how you should make your visualizations. Given the range of skills of students in the class, there isn’t one tool for everyone.

Our main interest is in the results. Good results are visualizations that effectively tell the stories they are trying to tell. How those visualizations are made is less important than how well they work. Well-chosen, basic charts can often tell interesting stories, but we would like you to try to tell richer, more complex stories.

We do encourage you to use this assignment as an excuse to learn about new and different tools. We intentionally added some extra time at the beginning of the assignment for people to do this. That said, this isn’t a time to go overboard: if you’ve never programmed in JavaScript before, now might not be the time to master D3. But, it might be a chance to try out Tableau – even if you decide to make your final pictures some other way.

Part of this assignment will require you to do some quick looking over the data set to see what stories are there – this is “exploration” (in statistics, they might call it Exploratory Data Analysis). The tools you use for this kind of exploration might be different than those you choose for making your final pictures.
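As one way to start this kind of quick look, here is a minimal Python sketch that profiles each column of a CSV file: how many values are missing, how many distinct values there are, and the numeric range where a column is numeric. The sample rows and the column names (`city`, `state`, `population`, `change_pct`) are invented for illustration and do not come from any of the course data sets; `profile_columns` is a hypothetical helper, not part of any tool mentioned here.

```python
import csv
import io

# Invented sample rows standing in for a real CSV file (assumption, not course data).
SAMPLE_CSV = """city,state,population,change_pct
Madison,WI,250000,1.2
Milwaukee,WI,595000,-0.3
Chicago,IL,2700000,
Springfield,IL,115000,0.1
"""


def profile_columns(rows):
    """Summarize each column: missing count, distinct count, numeric range if numeric."""
    summary = {}
    for name in rows[0].keys():
        values = [row[name].strip() for row in rows]
        present = [v for v in values if v != ""]
        info = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Try to read every present value as a number; give up on the first failure.
        numbers = []
        for v in present:
            try:
                numbers.append(float(v))
            except ValueError:
                numbers = None
                break
        if numbers:
            info["min"], info["max"] = min(numbers), max(numbers)
        summary[name] = info
    return summary


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
profile = profile_columns(rows)
for column, info in profile.items():
    print(column, info)
```

A profile like this will not find a story by itself, but it can point out which columns have gaps or a wide range before you commit to a tool for the final pictures.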

We’ll have some in-class sessions to help you get a handle on the tools:

Wednesday 2/22 – we’ll show off some ways you can use Excel to quickly explore data and make some pretty nice charts.
Friday 2/24 (optional class) – while the focus of the “kickoff” event is to discuss data, part of this will include looking at data, which will require us to practice using tools.
Wednesday 3/1 – We’ll give a little bit of a getting started guide to Tableau.
Friday 3/3 (optional class) – this “help session” will be a chance to ask questions about tools. If we can’t help you, maybe a classmate can.

Data Sets

You can choose any of the data sets on (link coming).

If there’s a data set you want to see on the list, submit it to us (and bring it to the optional class on February 24th). If we agree it’s good for the assignment, we will put it on the list for anyone to use (including you).

Examples

We had a similar assignment in 2015 (assignment posting). The data sets were a bit different, and the class was half undergrads (it was a 638). But you can see the results here. (You can’t follow the links to see anything other than the thumbnails – but it can give you a sense of what kinds of things students did in the past.)

Data and Example Questions

Try not to pick questions that can be answered with a single statistic – pick something where the visualization adds value. The richer and more complex the story (or set of stories) that the visualization tells, the more interesting (and challenging) it is, and the more opportunities you have to make a particularly cool “story”.

For example with the airline data (a month of flight delay information):

You could give the statistics on the average delay for flights leaving Madison
You could give the statistics on flight delays leaving Madison, helping someone choose which destination has the least delays, or what time of day you are most/least likely to have a delay, or some combination of both.
You could present information on a bunch of city pairs – for example, to help someone plan a trip between Madison and San Francisco, which hub city is it best to connect through? what time of day should you leave? (if your goal is to avoid delays)
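To make the difference between the first two options concrete, here is a minimal Python sketch that computes one overall average and then breaks the same delays down by destination and by departure hour. The rows are invented for illustration, and the column names (`Origin`, `Dest`, `DepHour`, `DepDelay`) are assumptions rather than the actual field names in the downloaded files; `mean_delay_by` is a hypothetical helper. MSN is the airport code for Madison.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented rows for illustration only; column names are assumptions,
# not the real field names in the airline data files.
SAMPLE_CSV = """Origin,Dest,DepHour,DepDelay
MSN,ORD,6,5
MSN,ORD,17,40
MSN,MSP,6,0
MSN,MSP,17,12
MSN,DTW,6,-3
ORD,SFO,9,25
"""


def mean_delay_by(rows, origin, key):
    """Average departure delay (minutes) for flights leaving `origin`, grouped by `key`."""
    groups = defaultdict(list)
    for row in rows:
        if row["Origin"] == origin:
            groups[row[key]].append(float(row["DepDelay"]))
    return {group: mean(delays) for group, delays in groups.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# A single statistic: the overall average delay for flights leaving Madison.
overall = mean(float(r["DepDelay"]) for r in rows if r["Origin"] == "MSN")

# Richer breakdowns: by destination and by departure hour.
by_dest = mean_delay_by(rows, "MSN", "Dest")
by_hour = mean_delay_by(rows, "MSN", "DepHour")

print(f"overall: {overall:.1f} min")
print("by destination:", by_dest)
print("by hour:", by_hour)
```

The single number hides the variation that the grouped results expose, and that variation is the kind of thing a visualization can show.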

We’ve picked the data sets (but you get to choose amongst them). You get to pick the stories to tell. Think about stories that someone would care about. Stories that would be interesting.

Grading / Turning Things In

Choosing a data set: we will not require you to tell us what data set you are choosing before the rough drafts are due. However, we recommend that you pick your data set early. If you want to have us add a data set to the list of data sets, follow the procedure above – but asking us to add a data set to the list will not impact your evaluation.

Rough drafts: due Tuesday, March 7th. Upload (at least) 2 PDF files (or other image files) to Canvas. And bring at least 1 on paper to class on March 8th. These will be graded check/no check (i.e., nothing/something/acceptable x late/not late – on our weird numeric scale). We may provide some feedback, but mainly we want to make sure you’re working on the assignment. We will take some class time to have people critique each other’s drafts.
In addition to uploading your PDFs of 2 designs, in the type-in comments, please tell us which data set you are using (although, it should be obvious from the vis), and what story you are trying to tell (optional if the vis is self-explanatory).

Designs Due: due Tuesday, March 14th. This is the “main hand in.” This will be turned in as an assignment on Canvas (LINK available in the future).

You need to upload 5 designs (4 questions, 2 designs for 1 question). You may submit 1 or 2 extras. Each design should be a separate PDF file, and be self-contained with a caption. However, it should not have your name on it (so we can send it out for anonymous critique).

As an additional document (either as a PDF or in the Canvas type-in box), explain how you made the pictures, and the questions that each is meant to address (hopefully it will be clear from the vis and caption). Your peer reviewers will not see this document, but the grader will.

We encourage (but do not require) you to turn in a ZIP file with the “implementation” of your visualization. In particular, if you wrote software yourself we will be curious to look at it (just the source). It won’t count against you, but if you did something particularly clever, please show it off! (whether it’s clever scripting, or amazing use of Excel or …)

The course staff (probably the TA and the instructor) will assign a grade (unclear if we will use a numeric scale or an A-F scale). The grade will be for the quality of what is turned in (other parts of the assignment, and penalties for being late will be added later).

The things we will consider include:

How good/interesting are the “stories” that you chose? Did you pick a diverse set? Are the things you chose to show multi-variate?
How well chosen are your encodings? Are they effective at communicating the message?
How well “implemented” are the designs? Are the specific detail choices made thoughtfully?

Visual appeal and implementation (beyond what is required for effectiveness) may be rewarded, but are not central.

Note: if your assignment is too late, we won’t be able to send it out to get peer reviews. Also, we won’t give you things to peer review until you submit your own.

Peer Reviews: due ~~Wednesday, March 29th~~ Friday, March 31st. Shortly after the 14th, we will assign each student a few of their classmates’ designs to critique.

We will grade the critiques (so you need to write good critiques!) and give the feedback to the designer, but the critiques will not be used to determine the grade for the design. ~~The exact mechanism for critiques will be provided later, but it will be graded just like discussions.~~

We have sent all students an e-mail with their assigned critiques. Please fill out the critique form for all of your assigned critiques for full-credit.

Design Challenge 1: Data Sets

by Mike Gleicher on February 13, 2017

These are the “approved” data sets for Design Challenge 1. Remember, you must use one of these approved data sets. If you want to use a different data set, you must get it approved (and we’ll put it on this list).

This list is in no particular order. The datasets are available in this Box folder.

Metropolitan Area Population Change

Note: this data set is small / easy. If you pick this one, the expectations for what you will need to do with it are much higher. I really dislike the vis on the Census Bureau website; you should do better (from the visualization, you can link to the data table). But the data is too small, and I’m not sure how many rich stories are to be found in it.

White House Budget Data

The data used in developing the budgets (back in 2016 and 2017). From the White House github. I recommend going to the 2017 branch and selecting “download ZIP” (look for the green “clone or download” button). There is good documentation, and the data is quite rich – giving historical spending in a lot of categories.

In the past, we considered the “receipts” data as small, and the “budgets and outlays” as harder data sets. Here we’re grouping them together.

Airline On-Time Performance

The Bureau of Transportation Statistics lets you download a lot of data, one month at a time from this page. We’ve downloaded a few months for you – but even if you download our versions, you might want to refer to this page for explanations of all the fields, and look up tables (files that say what the codes mean).

For this data set, you may choose to use the months we downloaded, or download your own (please specify what data you use). You can choose to use just 1 month, or you can pick multiple months to compare (if you want a real challenge).

Nationwide Crime Data

One of the functions of the Federal Bureau of Investigation (FBI) is to compile crime statistics within the US and use this information to help local law enforcement to curtail crime. Every year, the FBI releases this data along with recommendations for communities to stem violent crime. We have downloaded the 2014 year dataset (as well as 2015) of types of crime by area, available on Box.

If you use this dataset, we ask that you resist ranking cities/states or their law enforcement capabilities by their crime, as requested by the FBI. Showing trends and patterns should be your goal here.

Census Data By County

You can get census data in all kinds of forms. This page has 4 spreadsheets. Any one of them could tell an interesting story – but you probably want to put together multiple files. The complication is that it’s a long list of counties (you might just pick some, or try to give a sense of the range of what is going on, or identify unusual things, or …). The files are also in the Box.

The files are:

Population Estimates – has data 2010-2015 (per year) with inflows and outflows. There is a separate sheet in the Excel file that explains the columns.
Education – has data from multiple years (1970, 1980, 1990, 2000, 2015) for different levels of educational attainment.
Unemployment – has data from many different years
Poverty Estimates – mainly 2015 data, explanations for the columns in a separate sheet.

Time Usage Survey

The American Time Usage Survey (ATUS) tracks how people spend their time. There are corresponding international versions. There are actually lots of different surveys with interesting data available from the IPUMS website.

Getting a data set requires picking from all the options, and you can probably pull together an interesting data set in many ways. I grabbed one from the site. I also checked that, despite the scary agreements I had to agree to, sharing it with a class is legal (see this), so I put a grab of how Americans’ time usage has changed over the years into the DataSets Box folder.

You can find out what the “time use codes” mean on this page.

Interpreting the other codes requires some digging, unfortunately. Some are self-explanatory, but others… I tracked down the “FAMINCOME” columns: explanation here. The state codes are here.

Student Contributed Data Sets

Beijing Air Quality Data

2 Data Sets about Air Quality in Beijing, joined into a single cohesive table.

From the contributor:

The data comes from two sources:

Air quality data: http://www.stateair.net/web/historical/1/1.html (need to download each .csv separately)
Weather data: https://www.wunderground.com/history/airport/ZBAA (here’s a link to a .csv for 2011)

I first pulled the air quality data (where measurements are taken multiple times a day), and aggregated to be at the daily level. Then I merged the weather data to the air quality data. I have a GitHub repository with the data and R and Python code.

Note: the github repo not only has the documentation for the data, and the data conveniently processed into a CSV file, but it also has code for some basic visualizations. I can’t stop you from looking at the code. But, if you are not the author, you cannot turn in these visualizations.
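The processing the contributor describes (aggregate multiple readings per day to the daily level, then merge with daily weather) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the contributor's code: the field names (`timestamp`, `pm25`, `date`, `mean_temp_c`) and all values are invented, the daily aggregate is assumed to be a mean, and the merge is assumed to keep only days present in both sources.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Invented readings for illustration; field names and values are assumptions,
# not the contributor's actual schema or data.
hourly_air = [
    {"timestamp": "2011-01-01 00:00", "pm25": 30.0},
    {"timestamp": "2011-01-01 12:00", "pm25": 50.0},
    {"timestamp": "2011-01-02 00:00", "pm25": 120.0},
    {"timestamp": "2011-01-02 12:00", "pm25": 80.0},
]
daily_weather = [
    {"date": "2011-01-01", "mean_temp_c": -4.0},
    {"date": "2011-01-02", "mean_temp_c": -2.0},
    {"date": "2011-01-03", "mean_temp_c": -1.0},
]


def aggregate_daily(readings):
    """Collapse multiple readings per day into one mean value per day."""
    by_day = defaultdict(list)
    for r in readings:
        day = datetime.strptime(r["timestamp"], "%Y-%m-%d %H:%M").date().isoformat()
        by_day[day].append(r["pm25"])
    return {day: mean(values) for day, values in by_day.items()}


def merge_on_date(daily_air, weather):
    """Inner join on date: keep only days present in both sources (an assumption)."""
    merged = []
    for w in weather:
        if w["date"] in daily_air:
            merged.append({**w, "pm25_mean": daily_air[w["date"]]})
    return merged


daily_air = aggregate_daily(hourly_air)
table = merge_on_date(daily_air, daily_weather)
for row in table:
    print(row)
```

Whichever join you choose, it is worth noting in your process document how many days were dropped or left incomplete, since that affects what the merged table can show.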

UN Refugee Data

UN-Link: http://popstats.unhcr.org/en/asylum_seekers_monthly

Assignment 9: Interaction

by Mike Gleicher on February 12, 2017

Due Date: Initial Posting Due 3/13, discussion closes 3/31 (a little longer than normal because of Spring Break), but please have your required postings done by 3/17.

Handin: This will be a discussion on Canvas

Reading: Reading 9 (due 3/13)

For discussion, talk about the different ways you’ve seen for using interaction to address various challenges in creating effective visualizations. (your initial posting should be a summary of ideas, and then at least 2 additional postings as part of a conversation to understand what interaction can do, what it can’t, and the pros and cons of using it).

Assignment 8: Color

by Mike Gleicher on February 12, 2017

Due Date: Initial postings Monday 3/6, discussion open until 3/17 – but have the minimum postings done by 3/10

Turn in: Discussion on Canvas (coming soon – around 2/27)

Reading: Reading 8 Color

There are a lot of different topics to discuss about color, and we’ll be mashing them into one Canvas discussion. So, the minimum requirement is 4 postings (the initial posting, the second conversation starter, and at least 2 additional postings in response to others as a “discussion”)

For the initial posting (due Monday 3/6): Discuss how all the different aspects of color from the readings and class (perception, physics, art, reproduction, semantics …) can influence what we need to do as Vis designers.

For the second “conversation starter” – pick a “favorite” color ramp from Color Brewer and explain what it might be used for, and what it should not be used for (e.g. a different ramp would be more appropriate).

If you want something else to look at, check out Colorgorical – as a discussion point, you might provide a favorite ramp you created with it and compare it to what Color Brewer gives you.

The topics of the week should give you plenty to discuss.

Assignment 7: Perception

by Mike Gleicher on February 12, 2017

Due Date: Initial Posting Due 2/27, further discussion over the next week (cutoff after 3/10)

Turn in: Via Canvas Discussion

Reading: Reading 7 Perception 101 (Due 2/27 and 3/1)

Hopefully, the readings will give you a crash course in visual perception. The goal of this discussion is to get you to think about what you’ve read, and start to build the connections to visualization.

There’s a different discussion to have about “graphical perception” and how those papers inform our visualization practice – we’ll have that conversation later.

For this week, the discussion should be about human perception and how the facts of perception connect to visualization. We want everyone to post two “conversation starters” – and then at least 2 more postings in response to others (so this week, there is a minimum of 4 postings). Be sure to make your postings early enough that others can read and respond.

In your initial posting, give some examples of how the mechanisms of the visual system give rise to efficiencies and inefficiencies that can be considered to create effective visualization designs (and graphic design more generally). Can you connect facts about the visual system to design principles we’ve seen in class?
In a second posting, we are curious what “fun facts about how the human visual system works” you found surprising (or at least interesting). If you’ve taken a perception class before, you’ve seen all this, so maybe nothing was surprising this time around – so say what you found surprising the first time around.
As a conversation topic, try to build connections between this knowledge of how vision works, and what we might do for design.

Seek and Find 9: Something Interactive

by Mike Gleicher on February 12, 2017

Due: Friday, March 17th (discussion will stay open through March 31st – since break is in there)

Turn in Link: Discussion on Canvas

Normally, we prefer you to find visualizations that are not interactive. But since this week’s topic is interaction, we want you to find good uses of interaction in visualization.

You still need to post a static picture of the visualization – but you also need to have a link to the interactive version. If you think about it, a good submission will be one where the static picture doesn’t really do the visualization justice.

In your description, be sure to describe what the interaction is useful for – how does it help? What challenges are addressed with interaction? Why is interaction useful for this visualization?

Seek and Find 8: Something Colorful

by Mike Gleicher on February 12, 2017

Due: Friday, March 10th (discussion open for a week afterwards so you can discuss)

Turn in: Link on Canvas

Since we are learning about color this week…

For this seek and find, your task is to find a visualization where color is used well.

In addition to your image, provide a critique of how color is used. What is color being used for in the image? Are the color choices well-justified?

Seek and Find 7: Standard Designs

by Mike Gleicher on February 12, 2017

Due Date: Friday, March 3rd (discussion open for a week beyond that)

Turn in Link: Discussion on Canvas (will be available around Feb 27)

In this seek and find, you are tasked with finding a visualization that is based on a “standard” design. It should be a type of chart you’ve seen before – preferably one that has a name (from playing with Tableau or doing the third part of Reading 5 you should know the names of lots of common chart types).

We specifically want you to find “good” examples (despite being based on a standard design). A visualization that seems appropriate for the data and task.

As usual, provide a picture and a link. In your description, please explain:

What is the “standard design” that it is based on?
How appropriate is the design to the form of the data (i.e., the data abstraction)?
How appropriate is the design to the tasks? (or, what tasks is it appropriate for)
Why do you think this is an effective visualization?

This will be good practice for Design Challenge 1, where you will likely be using standard charts (since you’re most likely making them with standard tools).
