CSE6242 / CX4242, Fall 2017
Data and Visual Analytics
Georgia Tech, College of Computing

Project


Grading & Schedule

  1. Proposal (10%)
  2. Proposal presentation (10%)
  3. Progress report (15%)
  4. Final poster presentation (10%)
  5. Final report (55%)
See course homepage's schedule table for all deliverables' due dates.

Important: You will be submitting multiple files as part of your project deliverables. We will deduct 5% from a project deliverable for every file whose filename or file format differs from what we have specified. It is time-consuming for us to find "missing" files or to guess their names.

For example, suppose the final report requires README.txt and report.pdf; if your team submits README.doc and report.doc, 10% will be deducted from the final report's score.
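To double-check a submission before uploading, the deduction rule above can be sketched in a few lines of Python. This is only an illustration, not the graders' actual tooling: the function name filename_penalty is hypothetical, and it assumes that names are compared exactly (case-sensitive) and that each required file missing under its exact name costs 5 percentage points.

```python
from pathlib import PurePath

# Percentage points deducted per misnamed or wrongly formatted file (from the policy above).
PENALTY_PER_FILE = 5


def filename_penalty(required, submitted):
    """Return the total deduction for required files not found under their exact names.

    Hypothetical helper: assumes an exact, case-sensitive match on the file name.
    """
    submitted_names = {PurePath(name).name for name in submitted}
    missing = [name for name in required if name not in submitted_names]
    return PENALTY_PER_FILE * len(missing)


penalty = filename_penalty(
    required=["README.txt", "report.pdf"],
    submitted=["README.doc", "report.doc"],
)
print(penalty)  # 10, matching the example above
```

Running a check like this on the submission folder right before uploading is a cheap way to catch a wrong extension.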

Teaming Important!

The work will be carried out in teams of 4-6 persons.

A team may consist of both on-campus and distance learning students (Q or Q3 section). All such teams will have a 3-day lag on all their deliverables. For the proposal presentation and the final presentation, those teams can choose to present in class (physically) or submit videos (see details below).

We will grade grad projects and undergrad projects separately; we generally expect grad projects to include more detailed analysis, comprehensive results, etc.

Polo recommends that each group consist of either all grads or all undergrads. The main reason is that grads and undergrads have different expectations and work schedules. If you want to form a group with both grads and undergrads:
  1. We will grade that group as if all members are grad students
  2. Every member MUST fully understand the potential challenges in coordinating work schedules (e.g., grads usually take class on TH, undergrads on MWF) and expectations (e.g., course grades are generally very important for undergrads)

It is optional for an auditor to work on a project. When an auditor joins a team, the auditor MUST contribute to the team as an enrolled student would, and the whole team will be graded as if everyone is enrolled; otherwise it would be unfair to other teams. Every team member must fully understand and accept this requirement.

Choosing a Topic

Pick your own topic:

Harder way:

Once you have selected a topic, you should do some background reading so that you are capable of describing, in some detail, what you expect to accomplish. For example, if you decide that you want to implement a new proposal for a multidimensional file structure, you will have to carefully read the papers that propose similar structures, pinpoint their weaknesses, and explain how your approach will address these weaknesses. Once you have read up on your topic, you will be ready to write your proposal.

Proposal

Your proposal should answer Heilmeier's questions (all 9 of them; see list below); if you think a question is not very relevant, briefly explain why. In other words, your proposal should describe what you plan to do (the problem to address), why you want to do it, how you will do it (what tools? e.g., SQLite, PostgreSQL, Hadoop, Kinect, iPad, etc.), how your approach is better than the state of the art, why it may succeed, what difference it will make if it does, how you will measure success, how long it will take, etc.

9 Heilmeier questions (source)
  1. What are you trying to do? Articulate your objectives using absolutely no jargon.
  2. How is it done today; what are the limits of current practice?
  3. What's new in your approach? Why will it be successful?
  4. Who cares?
  5. If you're successful, what difference and impact will it make, and how do you measure them (e.g., via user studies, experiments, groundtruth data, etc.)?
  6. What are the risks and payoffs?
  7. How much will it cost?
  8. How long will it take?
  9. What are the midterm and final "exams" to check for success? How will progress be measured?

Your proposal should be no more than 1000 words, excluding titles, section names, reference list, etc., but including the literature survey. It should use 12pt font, typed in PDF format (can be created using any software, e.g., LaTeX, Word), and with figures, tables, etc. whenever useful. It should be self-contained. For example, don't just say: "We plan to implement Smith's Foo-Tree data structure [Smith86], and we will study its performance." Instead, you should briefly review the key ideas in the references, and describe clearly the alternatives that you will be examining.

An appendix is for optional, non-essential information. We may not read or even grade it. Please do NOT put your survey in an appendix.
Some teams, especially those that want to turn their project into a research publication, use LaTeX for typesetting. If your team chooses to go this route, you may consider using tools like Git (GT GitHub) or Overleaf to work on the article collaboratively. For the LaTeX template, we suggest ACM's standard template (sigconf). You may need to increase the template's default font size to 12pt (e.g., by changing "\def\ACM@fontsize{10pt}%" in the acmart.cls). You are also welcome to use other templates (e.g., IEEE, Springer).
How to write the survey without using too many words?

Grading scheme & Submission instructions

Proposal Presentation

Grading

Tips

Progress Report

This should be no more than 1600 words, 12pt font, typed.

It mainly serves as a checkpoint, to detect and prevent dead-ends and other problems early on.

It should consist of the same sections as your final report (introduction, survey, etc), with a few sections "under construction", describing the work performed up to then, and the revised plans for the whole project.

Specifically, the introduction and survey sections should be in their final form. The section on the proposed method should be almost finished. The sections about experiments and conclusions will have whatever results you have obtained, as well as "placeholders" for the results you plan/hope to obtain.

The progress report may be written based on your proposal. For example, the survey in the progress report is not required to be identical to the survey in the proposal. That is, you may update the proposal's survey as needed. Of course, the number of papers should not drop below the requirement (3 papers/team member), and the quality of discussion should still be equal to or better than that in the proposal.
An appendix is for optional, non-essential information. We may not read or even grade it. Please do NOT put your survey in an appendix.

Grading scheme & Submission instructions

Final Poster Presentation Peer-graded

Logistics

Each team will create a single poster (for the whole team).

Each student will be a presenter once, and be a grader several times:
  1. Each student will present his/her team's poster during the poster session. Each student will have 3 minutes to present the poster, and 1 minute for Q&A. All team members must attend the poster session. If you cannot attend part of it, you must write to the instructor at least 5 days before the presentation, or you will receive 0 as your presentation score.
  2. Each student will also grade several presentations given by students in other teams during the poster session. A student's presentation score will be the average of the scores that he/she receives. Before the poster session, we will let each student know when he/she will present, and when he/she will grade others.
Thus, every team member should know his/her project very well, and be prepared to answer questions.

Demo: Optional but encouraged. The demo time counts towards the presentation time. If you decide to give a demo, please bring your own laptop. Assume there will be little or no internet connection, and no ready access to power outlets.

Who will attend: We plan to open the session up to everybody (College of Computing, etc.).

Everyone is welcome to walk around to see other teams' posters.

Poster Design

Design and print the poster *well before* your presentation day, to avoid a last-minute rush.

The poster must be in portrait orientation, 30 inches wide and 40 inches tall. Foam core poster boards, push pins, and easels will be provided to you to mount the poster. We suggest 18pt font size and larger.

A deck of PowerPoint slides is not acceptable as a poster. However, you may print your design on multiple smaller sheets of paper and then carefully stitch them together. See the illustration below for what is allowed and what is not.

[Figure: Examples of what is OK and what is not for poster design]

Your poster should cover the following parts (point distribution shown on the left).
10% Motivation/Introduction:
  5% What is the problem (no jargon)?
  5% Why is it important and why should we care?
20% Your approaches (algorithm and interactive visualization):
  5% What are they?
  5% How do they work?
  5% Why do you think they can effectively solve your problem (i.e., what is the intuition behind your approaches)?
  5% What is new in your approaches?
10% Data:
  5% How did you get it? (Download? Scrape?)
  5% What are its characteristics (e.g., size on disk, # of records, temporal or not, etc.)?
25% Experiments and results:
  5% How did you evaluate your approaches?
  10% What are the results?
  10% How do your methods compare to other methods?
10% Presentation delivery:
  5% Finished on time?
  5% Spoke clearly and at a good pace?
25% Poster Design:
  5% Layout/organization (Clear headings? Easy to follow?)
  5% Use of text (Succinct or verbose?)
  5% Use of graphics (Are they relevant? Do they help you better understand the project's approaches and ideas?)
  5% Legibility (Are the text and figures too small?)
  5% Grammar and spelling
If your team has Q or Q3 students, the team can choose to:
  1. Present in class, physically
  2. Submit individual 3-minute video presentations (one presentation per student) through T-Square via the entry “Poster Video - Q”. The standard 3-day lag applies.
    Name each video recording teamXXposter-YY.mp4 (or .avi or .mov), where XX is the team number (e.g., 01 for team 1), and YY is the student's last name (e.g., smith).

    Project slip days CANNOT be used for this video presentation submission, since on-campus students cannot use their project slip days either (they all present on the same day).

    Your video should show your poster (e.g., as pdf on your computer screen via screen capture, say using Quicktime, MonoSnap, etc.) with voice narration; it is up to you whether to show your face. You should be able to create this recording quickly with little effort – no need to do any special video or audio editing.
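As a convenience, the video naming convention above can be sketched in Python. This is a minimal illustration, not an official checker: the names video_filename and is_valid_video_filename are hypothetical, and the sketch assumes two-digit team numbers and last names written in lowercase ASCII letters only, following the "01" and "smith" examples.

```python
import re

ALLOWED_EXTENSIONS = ("mp4", "avi", "mov")

# teamXXposter-YY.ext, assuming XX is two digits and YY is lowercase letters.
VIDEO_NAME = re.compile(r"team(\d{2})poster-([a-z]+)\.(mp4|avi|mov)")


def video_filename(team_number, last_name, extension="mp4"):
    """Build a file name such as team01poster-smith.mp4 (hypothetical helper)."""
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension must be one of {ALLOWED_EXTENSIONS}")
    return f"team{team_number:02d}poster-{last_name.lower()}.{extension}"


def is_valid_video_filename(name):
    """Check a file name against the assumed pattern."""
    return VIDEO_NAME.fullmatch(name) is not None


print(video_filename(1, "Smith"))
print(is_valid_video_filename("team1poster-smith.mp4"))
```

The first call produces a name in the required form; the second shows that a one-digit team number is rejected under the two-digit assumption.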

Possible software to create posters

  1. PowerPoint/Word (save as PDF) -- GT's Office365 PowerPoint supports collaboration.
  2. Apple Pages (FREE) supports real-time collaboration (via iCloud and desktop software)
  3. Inkscape (free, cross-platform)
  4. Polo uses Affinity Designer (Mac and Windows)
  5. Adobe Illustrator is pretty good; it is available as a limited trial and is also installed on the Library Mac Mini.

Where to print posters?

  1. Paper and clay. http://studentcenter.gatech.edu/seedo/paperandclay/Pages/default.aspx
  2. Poster printing is available for free at the GVU, but you have to physically go to the machine, log in, and upload your pdf http://gvu.gatech.edu/wiki/index.php/Poster_Printing_FAQ
  3. Poster printing is also available at the library http://librarycommons.gatech.edu/lwc/multimedia.php
  4. PCS - more expensive http://www.oit.gatech.edu/rm/service/print-copy/print-and-copy-services
Above info curated from previous classes. Thanks Yin-Shu Kuo, Jitesh Jagadish!

Example posters

  1. Apolo graph exploration poster
  2. GLO-STIX poster
  3. Insider trading pattern discovery poster
  4. MMap poster
  5. Comment spam detection poster

Final Report

It will be a detailed description of what you did, what results you obtained, and what you have learned and/or can conclude from your work.

Components:

  1. Writeup: no more than 2800 words, 12pt font, typed. Describe in depth the novelties of your approach and your discoveries/insights/experiments, etc. 
  2. Software: packaging, documentation, and portability. The goal is to provide enough material, so that other people can use it and continue your work, if you are to open-source it -- in other words, you should make it easy and attractive for others to use your work.

Grading scheme & Submission instructions


Should datasets be included as part of our submission?

If you are referring to (small) toy data for a demo (that we/TAs will run), you are welcome to include them. Think about the open-source software libraries that you have seen or used: they often include some sort of "quick start" guide to get a demo running on a toy dataset.

For large datasets, please do not include them; if the dataset is public and can be easily downloaded, include the link to the dataset.

If getting a dataset requires writing scripts/programs, include those scripts, and write down the steps that people will need to go through (e.g., register for an account to get API key).

If you have processed the dataset in some way, include the code you used, and the steps people will need to go through.