DC2 Proposal Codes
I tried to provide quick feedback on the proposals. Unfortunately, I wasn’t able to spend much time with them, but hopefully I can give something useful quickly. In order to cram things into Canvas comments, I coded my comments. I am showing the codes to everyone - a comment that was given to someone else might apply to you. Indeed, some of the advice applies very generally.
For scoring: if you filled in the questionnaire, you got a 6. If your proposal was really missing something big (many proposals missed smaller parts), you might get a 2. If your proposal was well put together, it might get an 8 (I erred on the side of not giving this enough). I couldn’t be consistent about this, so I didn’t give 8s even in the extreme cases - but those that deserve one probably know it, and your effort now will pay off since it will help your project go more smoothly. Note: the scores are for the quality of the proposal, not for the quality of the idea being proposed!
Scores are given mainly as a trigger so that you will look at the comments; the comments given to your specific proposal are also meant as a way for you to look over the broader list.
Again, the list of codes given to your proposal is not comprehensive - it’s just a few that seem to apply (or that you were the first one a comment applied to).
The project ideas look really interesting. I am excited to see how things turn out! Hopefully this little bit of steering will help people stay on track.
Codes
General (applies to all problems)
G1 - very thorough proposal. OK, many deserve this; there were many different types of thoroughness, and I didn’t always comment on it.
G2 - it seems like this is exploring a particular data set, which might be OK as a “design prototype”, but that puts a higher standard on the design elements. There will be more pressure on your rationale and analysis to explain why your design is likely to work well (i.e., really address the tasks) and to scale/generalize to other data sets.
G3 - use of prototyping tools is a good thing.
G4 - good support of design ideas using references/literature
G5 - very detailed plan shows “thinking ahead” even if the plan isn’t followed exactly.
G6 - it may be good to decide early on if the project is implementation oriented, or just “sketches”.
G7 - exemplar “stories” are a good way to evaluate
G8 - the design seems plausible, but I am not sure I really understand it. Make sure to have examples that show how it addresses the tasks.
G9 - this is an intriguing design. I have no idea how well it would work, or how to put together the details to make it fully functional; but it seems a cool thing to try.
G10 - intriguing design ideas. I look forward to seeing how you realize them!
G11 - don’t get too hung up on getting a specific data set.
G12 - think about how the design addresses the tasks (and be sure to articulate tasks)
G13 - readability by others is the gold standard for evaluation, but it requires experiments that we probably cannot run as part of a class project.
G14 - I like the “try lots of things and assess” - it’s a good design process.
G15 - Proposal is too generic to show that you have ideas about what to do.
G16 - The basic idea is straightforward, and there are examples of it in the literature. But a nice implementation and details and rationale/analysis can be a good project.
Subgroup
S1 - a goal of the problem is to scale to many combinations of variables. Requiring the user to go through variables one at a time (or pick specific combinations) may not scale to allowing the user to look over many different potential subgroups. Be sure to consider how you will allow the user to scan over the combinatorially large number of subgroups.
S2 - beware ordering effects if you pick variables in a sequence. For example, tree designs (there were several) will be very different depending on the order of variables.
S3 - the user may not know what subgroup is interesting - they may need help sifting through a large number of combinations. If the user needs to specify a subgroup, this might make for a lot of trial and error.
S4 - some projects focused on comparing an identified subgroup to others; this is a variant of the proposed problem (which focused on identifying the subgroup to look at and determining which subgroups are worth looking at - see S3). It is OK to have this focus, provided there is at least some attention paid to S3.
Dimensionality Reduction
D1 - Unclear how the technique relates to the questions at hand.
D2 - Seems to be more about dimensionality reduction analysis - be sure to bring in visualization concepts.
D3 - Good focus on a specific problem.
D4 - You introduce some metrics - do you have motivation for why these are important to viewers?
D5 - I am concerned that you don’t have examples of your specific problem. This is a case where coming up with real data examples where your methods apply will be important.
D6 - Using interaction to select points/clusters to probe is a good idea. Can you help a user who doesn’t know where to start?
D7 - Comparing the properties of clusters in high dimensions to their low dimensional counterparts is a good idea (e.g., relative size) - but it requires metrics that can be measured in both spaces.
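A minimal sketch of the kind of metric D7 calls for: mean distance to the cluster centroid is dimensionality-agnostic, so the same measurement can be made in the high-dimensional space and in the 2D layout and then compared. The data and the “projection” below are synthetic stand-ins (a real project would use t-SNE, UMAP, etc.), and the function names are illustrative.

```python
import numpy as np

def cluster_spread(points, labels):
    """Mean distance of each cluster's members to its centroid.

    Works in any dimensionality, so it can be computed in both the
    high-dimensional space and the 2D projection and then compared.
    """
    spreads = {}
    for label in np.unique(labels):
        members = points[labels == label]
        centroid = members.mean(axis=0)
        spreads[int(label)] = float(
            np.linalg.norm(members - centroid, axis=1).mean()
        )
    return spreads

# Synthetic stand-ins: a tight and a loose cluster in 5D; the
# "projection" just drops dimensions (not a real DR method).
rng = np.random.default_rng(0)
hd = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(5, 3, (50, 5))])
labels = np.array([0] * 50 + [1] * 50)
ld = hd[:, :2]

hd_s, ld_s = cluster_spread(hd, labels), cluster_spread(ld, labels)
# Compare *relative* sizes: if the projection preserves relative
# cluster size, these two ratios should roughly agree.
hd_ratio = hd_s[1] / hd_s[0]
ld_ratio = ld_s[1] / ld_s[0]
```

The point of using a ratio is that absolute distances are not comparable across spaces, but relative sizes can be.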
D8 - Bias may be more a function of the training data set than of the algorithm; but the standard data sets probably have issues that have already been identified.
D9 - Be clear/careful about what “corresponds” means between HD and 2D data.
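One concrete reading of “corresponds” for D9 is neighborhood preservation: how many of a point’s nearest neighbors in the high-dimensional space are still its neighbors in 2D. A small brute-force sketch (related to the trustworthiness/continuity metrics in the DR literature; the function name is illustrative):

```python
import numpy as np

def knn_preservation(hd, ld, k=5):
    """Average fraction of each point's k nearest high-dimensional
    neighbors that are also among its k nearest 2D neighbors."""
    def knn(points):
        # Pairwise Euclidean distances, brute force.
        d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
        return np.argsort(d, axis=1)[:, :k]
    hd_nn, ld_nn = knn(np.asarray(hd, float)), knn(np.asarray(ld, float))
    return float(np.mean([len(set(a) & set(b)) / k
                          for a, b in zip(hd_nn, ld_nn)]))

rng = np.random.default_rng(1)
hd = rng.normal(size=(60, 8))
perfect = knn_preservation(hd, hd.copy())  # identical spaces -> 1.0
rough = knn_preservation(hd, hd[:, :2])    # dropping dims loses neighbors
```

Scores near 1.0 mean neighborhoods survive the projection; lower scores flag points whose 2D surroundings are misleading.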
D10 - This seems to require a non-trivial analytical piece that I am not sure how standard it is (i.e., do you need to invent it yourself). If you do invent it, it might require some thought to determine how meaningful/rigorous it is.
D11 - This seems to imply there are clusters in the original high-dimensional data (which may or may not be clean).
Tiny Charts
For what it’s worth, Stacked Bar Charts are the most commonly chosen problem - although there is some interesting diversity in the approaches. It will be extremely interesting to compare and contrast what people come up with!
T1 - Considering issues with aspect ratio (not just size) is something I had not thought of, and is a good idea! (some described this as “width and height separately”)
T2 - Consider how you will recognize if something is too small/hard to be seen (e.g., a slice of a stacked bar). Can you do this with the data, or will you need to render the image first? Analyzing a rendered image is hard.
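The data-side check T2 suggests can be quite simple for stacked bars: estimate each segment’s rendered height from the data and the target chart size, before rendering anything. A sketch (the function name and the 2-pixel threshold are illustrative assumptions):

```python
def too_small_segments(values, chart_height_px, min_px=2.0):
    """Flag stacked-bar segments whose rendered height would fall
    below min_px, assuming the bar spans the full chart height."""
    total = sum(values)
    return [v / total * chart_height_px < min_px for v in values]

# In a 40 px tall bar, the 1-unit segment renders at 0.4 px.
flags = too_small_segments([50, 30, 19, 1], chart_height_px=40)
# flags == [False, False, False, True]
```

A check like this works from the data alone; anything that depends on antialiasing, borders, or color contrast would still need the rendered image.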
T3 - Not specific about which design to consider (or considers multiple designs). It may be best to do one first (unless there really is some general principle that extends across multiple).
T4 - Clustering to reduce items to groups is a good idea! Beware that clustering doesn’t always work as planned.
T5 - Coming up with new color sets may be tricky; averaging colors doesn’t always work as expected.
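T5’s caveat about averaging colors can be made concrete: channel-wise averaging of gamma-encoded sRGB values darkens the result, while averaging in (approximately) linear light stays closer to the expected brightness. A toy sketch - it approximates sRGB with a plain 2.2 gamma, which is an assumption; real work would use the exact sRGB transfer function or a perceptual space such as CIELAB:

```python
def srgb_to_linear(c):
    # Assumption: approximate sRGB decoding with a plain 2.2 gamma
    # (the real transfer function is piecewise, but this shows the effect).
    return (c / 255.0) ** 2.2

def linear_to_srgb(c):
    return round(c ** (1 / 2.2) * 255)

def average_colors(rgbs, gamma_correct=True):
    """Channel-wise average of a list of (R, G, B) triples."""
    n = len(rgbs)
    if gamma_correct:
        return tuple(
            linear_to_srgb(sum(srgb_to_linear(c[i]) for c in rgbs) / n)
            for i in range(3)
        )
    return tuple(round(sum(c[i] for c in rgbs) / n) for i in range(3))

red, green = (255, 0, 0), (0, 255, 0)
naive = average_colors([red, green], gamma_correct=False)  # (128, 128, 0), a dark olive
better = average_colors([red, green])                      # noticeably brighter
```

The naive average lands well below the brightness a viewer would expect for a mix of two fully saturated colors, which is one reason merged/aggregated color sets often look muddy.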
T6 - Consider how to recognize what the distribution is. This is sometimes called “model selection”. Often an advantage of visualization (over statistics) is to show how the data doesn’t really fit the model.
T7 - Asking the user what they care about is an intriguing idea, and gets around (T2, T6). Consider how to make this easy for a viewer who may not want to think about it, or may not have visualization experience.
T8 - Awareness of many issues that need to be dealt with. Ideally, you will address them all; but even showing how addressing some of the issues causes the others to come out in the self-assessment will be good.
T9 - This kind of chart can be tough even at “full size”! “Tiny” might mean “not huge”.
T10 - Interaction is interesting, and it is OK to include it, but consider that interaction is challenging in many tiny chart scenarios.
T11 - Thinking about this as a scatterplot opens up a rich literature of potential solutions, though it is not clear which ones will be effective for tiny versions of charts.
T12 - The generalization of this could be interesting (maps as the latent spaces of dimensionality reduction, and showing graphs of how different regions behave).