Design Exercise 1: Try Tableau
For our first “at home” design exercise, we want to make sure that you have at least tried Tableau.
The requirements are pretty minimal: all you need to do is make two visualizations, each using a different data set. We don’t ask for anything beyond this.
You probably want to spend some time figuring out how to do things with Tableau since some real exercises with it are coming up.
For information on Tableau in class (including getting access and why we are using it), please see the Tableau page.
We will use Tableau for some upcoming assignments, and we want to make sure that everyone at least tries it out now. You should at least address the access issues (how you are going to use it), and hopefully also sort out some of the complexities of using it. We don’t ask you to do much now, but hopefully you will choose to do more, either because you are interested in it (Tableau is quite interesting, in many ways) or just out of fear of needing to use it in the future.
The hand-in mechanism is a bit of an experiment: we will use a Canvas Quiz with “upload file” questions for you to submit your visualizations.
The quiz is: Design Exercise 1: Try Tableau! (due Mon, Sep 27)
We ask that you upload two visualizations, each from a different data set. The quiz also asks whether you were trying to show something in each visualization (it’s OK if not, but you might want to try making visualizations where you have something in mind).
We will provide some data sets for you to experiment with, but for this exercise you can use any data you want.
This is due on Monday, September 27th. You might want to do it sooner, so you have more time to practice with Tableau. We will accept late submissions until October 3rd (you need to do it before the next assignment on October 4th).
Data Sets to Try
Note: these data sets are for Design Exercise 1 and 2 - they are not the data sets for the larger parts of Design Challenge 1.
Some of these data sets we decided aren’t “rich” enough for the big design challenge: it isn’t obvious that you can tell enough good stories from them. They are all good for practicing. One or two of the bigger ones might reappear for DC1.
For DE1, you need to pick two. For DE2, you need to pick one (it can be one of the ones you picked for DE1).
To make things more convenient for you, we will provide all of these as Tableau Online data sets, so you don’t need to upload them yourself. If you aren’t using Tableau Online, you will need to get your own copy of the data set. We’ll post instructions on how to access our uploaded versions separately.
The data files (as CSV) are all in a Canvas Files folder.
Pizza (Barstool)
- documentation: https://github.com/rfordatascience/tidytuesday/tree/master/data/2019/2019-10-01
- this has other data sets as well
- https://github.com/rfordatascience/tidytuesday/raw/master/data/2019/2019-10-01/pizza_barstool.csv
- 464 lines
- might not be enough to tell stories, but good enough to experiment with
IMDB 5000 Movies (Kaggle Data Set)
- originally from Kaggle
- this was an option for the overall challenge in previous years
- there are 5000 movies (rows), and a good number of attributes of each movie
NYC Airbnb Data
- originally from Kaggle
- 49K rows, 7MB
- this is big enough to be interesting, but the set of features may be too limited to tell interesting stories
2017 President’s Budget (the last year for which we have the data easily available)
- these data sets have historical trends (nice for looking at multiple series) and part/whole relationships
- lots of documentation: https://github.com/WhiteHouse/budgetdata/tree/2017
- Receipts (small - 245 rows) GitHub
- Budget Authorization GitHub
- Outlays GitHub
Spotify Music
- 12 continuous features, a few categorical features, plus names
- 32K songs, 8MB CSV file
- features well documented in GitHub ReadMe
- GitHub Repo w/README
- CSV File
- This dataset is big enough to be interesting… but it isn’t clear that good stories will emerge.
Student’s Choice
- I wasn’t going to allow students to come up with their own data sets, but for this exercise you can. You may only do this for one of your two DE1 data sets, and there are rules.
- The rules you must follow:
- The dataset must be publicly available. You must be able to give a URL where someone can find it publicly. Other students must be able to use it.
- You must announce it (so that other people can use it as well). Make a posting on Piazza giving the URL where someone can find it on the web, and a brief description.
- Getting it into Tableau is up to you. Generally, this is easy (Tableau is pretty smart about uploading data), but you might need to do some “cleaning”.
- This is only for DE1 - not DE2.
- You may pick a data set that someone else has already posted on Piazza (you don’t need to post again).
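If your own data set needs some of the “cleaning” mentioned in the rules above, a small script run before loading the file into Tableau can help. The sketch below is only one guess at what cleaning might involve (trimming stray whitespace and dropping empty rows); your data may need something different. The helper name `clean_csv_text` and the sample data are made up for illustration and are not part of the course materials.

```python
import csv
import io


def clean_csv_text(raw_text):
    """Return a tidied copy of CSV text (hypothetical helper for illustration)."""
    reader = csv.reader(io.StringIO(raw_text))
    cleaned_rows = []
    for row in reader:
        # Trim stray whitespace around every cell, including header names.
        stripped = [cell.strip() for cell in row]
        # Skip rows that are blank or contain only empty cells.
        if not any(stripped):
            continue
        cleaned_rows.append(stripped)

    # Write the surviving rows back out as CSV text.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(cleaned_rows)
    return out.getvalue()


# Made-up sample data, only to show what the helper does.
sample = " name , score \nLuigi's, 8.1 \n,\n\nJoe's ,7.4\n"
print(clean_csv_text(sample))
```

To use this on a real file, you could read the file’s contents into a string, pass it through the helper, and save the result as a new CSV to open in Tableau. Tableau handles many of these problems on its own, so try loading the raw file first.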